## page was renamed from EasyGpg2016/Validity
<<TableOfContents>>

= Automated Encryption =

The idea behind automated encryption is simple: enable encryption without (much) user support.
The goal is to protect users not just from passive attackers (e.g., someone listening), but also
from man-in-the-middle attacks and forgeries, at least as much as possible
without requiring much user interaction.  In particular, we should only require help from the user
if there is a good chance that she is being attacked.

This page is intended as a basis for discussing validity display,
opportunistic mail encryption, and how to use the trust-model tofu+pgp
for automated encryption.

== What are our goals and how do we achieve them? ==

* Prevent mass surveillance:

It's not difficult to imagine that all clear text emails are saved by
many governments and immediately analyzed.  By encrypting mail by
default whenever possible, we dramatically increase the cost of this
type of surveillance.  For instance, the government would have to interfere
with key discovery (similar to how
[[https://www.eff.org/de/deeplinks/2014/11/starttls-downgrade-attacks|Verizon inhibited transport level security over SMTP]])
to prevent users from learning about their communication partners' keys.

Solution: a way to find a reasonable key without the user's help.  (See: WKD / WKS.)

* Spam phishing:

Phishing is a common type of fraud.  A simple example is an email that
is apparently from your bank prompting you to take some action that
requires you to log in.  The link to the log-in site is actually to
a site controlled by the attacker, which steals your credentials.
Another example of such an attack is a mail containing malware in
an attachment.

This type of attack can be prevented by using signatures to verify
the sender's address.  Since we don't require users to actively
authenticate their communication partners, preventing this type
of attack requires recognizing that the sender is attempting an
impersonation.

Solution: there are two ways to detect this type of attack.

First, phishing attacks are successful, because the mail content
and headers look authentic.  That is, an integral part of phishing is
imitating a person or institution to trick the mark.
A common technique is to use an email address that is a
homograph of the real email address, e.g., using a Cyrillic a in place
of a Latin a.  [[https://cloud.googleblog.com/2014/08/protecting-gmail-in-global-world.html|Google]]
detects these types of phishing attacks by checking that email addresses
conform to unicode's
[[http://www.unicode.org/reports/tr39/#Restriction_Level_Detection|highly restricted]] restriction level.  We could do something similar and show
a warning if an email address doesn't pass this test.
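As a rough illustration of such a check (much cruder than Unicode TR39's restriction levels; the function names here are ours), one could flag addresses whose letters come from more than one script:

```python
import unicodedata

def scripts_used(text):
    """Approximate the set of scripts used, via Unicode character names."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])  # e.g. "LATIN", "CYRILLIC"
    return scripts

def looks_like_homograph(address):
    """Flag addresses that mix scripts, e.g. a Cyrillic 'a' among Latin letters."""
    return len(scripts_used(address)) > 1

assert not looks_like_homograph("alice@example.org")
# "\u0430" is CYRILLIC SMALL LETTER A, a homograph of Latin "a":
assert looks_like_homograph("\u0430lice@example.org")
```

A real implementation would use the script data from TR39 rather than character names, but the principle is the same.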

Second, if we assume that the user will regularly receive signed emails
from her bank, then we can exploit the communication history to show
that signed messages from previously unseen / rarely seen email addresses
shouldn't be trusted.  This requires vigilance on the part of the user
to realize that the message didn't verify, but should have.  It also
requires that the user be educated.  Further, if
a spammer uses the same email address & key many times, the email address
may eventually appear to be trustworthy using this metric.

* Targeted (spear) phishing or CEO-Fraud:

This attack is similar to the spam phishing described above, but
the stakes are higher.  An example of this type of attack is when
an assistant receives an email allegedly from the CEO requesting
that the assistant immediately transfer some funds to a particular
account.  Unlike the above attack, in this case, the victim is
targeted, and the potential monetary damage much higher.

Solution: automated techniques and the use of history cannot fully mitigate
this attack; the employee must be trained to recognize certain signals.
A possible mitigation is to have a list of fully trusted keys, and
show messages that are signed with these keys differently.  Note: this
doesn't mean that the employee must necessarily curate this list;
this can be done by the IT department.

* Man in the Middle attacks:

A Man-in-the-Middle (MitM) attack is when an adversary is actively
decrypting and re-encrypting email.  For this to work, the MitM
must 1) get Alice and Bob to use keys that he controls and
2) re-encrypt every communication to avoid detection.

To get Alice and Bob to use keys that he controls, the MitM
must intervene during the initial key discovery.  In this case,
we can detect the MitM attack when a valid message eventually
gets through.  This could occur if Alice receives a message via a
channel that the attacker doesn't control.

If a good message gets through and is encrypted, Alice will be unable to
decrypt it and she will probably tell Bob that something went wrong.
Most likely, Alice and Bob will not be sufficiently
technically savvy to diagnose the actual problem.  Since everything
will work when they use their usual communication channel, they will
ignore the issue.  To actually detect a conflict in this situation,
Alice's MUA could fetch all of the keys specified in the PK-ESK
packets.  This may allow Alice's gpg to detect a conflict: normally
the sender's MUA encrypts emails to not just the recipients, but
also the sender so that the sender can later review the sent message.
Note: this scenario can occur due to a misconfiguration, e.g.,
the message is not encrypted to Alice.

If the message is only signed, then when we fetch the key used to
sign the message, we will detect a conflict.  In this case, we
can prompt Alice to contact Bob to figure out what the right key is,
which gives us a chance of defeating the man-in-the-middle.

Note: if Bob proactively
sends a message to Alice, then he will (hopefully) access Alice's
key via an authenticated key store, such as WKD, in which case
the attacker would have to break the store's protection (e.g., TLS)
to make sure Bob gets the attacker's key.  On the other hand, if
the attacker sends a forged message to Bob, and Bob just
downloads the specified key from the key server, then the attacker
has successfully intervened.  This suggests that we should always
check WKD for the right key, if possible.

If the MitM attempts to intervene after Alice and Bob have already
successfully communicated, e.g., by sending Bob a forged message,
then we can detect the MitM due to the conflict and
we can prompt the users to exchange fingerprints to figure out the
right key.

* Forensic detection of attacks:

If an attack happened, it will typically be possible to determine
what happened (which messages were read, and which messages were
sent) after the fact, based on the state maintained by gpg.
(See the limitations below.)

== Limitations of automation ==

When communication partners authenticate each other over
a secure channel, e.g., exchanging business cards in person,
a MitM cannot hijack the channel without the communication
partners knowing.  An automated system doesn't normally 
have access to a secure channel.  But, there are different levels of
insecure.  For instance, if the initial key discovery is
done over TLS to an accountable entity (i.e., something like WKD),
this makes a MitM attack significantly more expensive than just looking
up a key from a key server.  But given that adversaries
capable of monitoring all channels are probably capable of
[[https://freedom-to-tinker.com/2015/10/14/how-is-nsa-breaking-so-much-crypto/|circumventing TLS]]
or at least compelling mail providers to compromise their WKD using
tools like National Security Letters (NSLs), WKD should not
be regarded as completely authoritative.

To protect against more sophisticated attackers, an automated system
can be combined with something like the web of trust.  To be usable,
the user must be able to distinguish these different trust levels,
i.e., the cases where we have reason to believe that the connection
is secure, and the cases where we know the connection is secure.
(In the system described below, levels 1 & 2 correspond to a
connection secured using the automated approach, and levels 3 & 4
correspond to a connection that was bootstrapped over a secure
channel.)

We also need to distinguish between keys that are automatically
looked up and keys that the user explicitly looks up.  WKD works
by identifying a key that belongs to an email address.  If the
user explicitly enters an email address (e.g., when encrypting),
then we know that the user intended that email address.  On the other
hand, if we fetch a key via WKD in order to validate a signed message (e.g.,
we use WKD to find the key for phishy@examp1e.org), then we should
be less confident that the key is trustworthy, and that the message
is authentic.

= Details =

== Trust Levels ==

Definitions of wording:

userid: The userid on a key that matches the smtp address of the sender.
userid.validity is set as follows.  By default, it is Marginal.  If the key is fully
trusted via the web of trust, then it is Full.  If the key is explicitly marked
as bad or unknown, then it is Never or Unknown, respectively.

tofu: The information we have about the communication history; this roughly
reflects the A~P~I of gpgme_tofu_info_t.  As tofu info describes key + userid
pairs, it is also sometimes called a "binding".

key source: The source the key was imported from, e.g., whether it was
automatically imported over https, or comes from the public keyservers.

"Key with enough history for basic trust": the tofu validity threshold used
in the level definitions below; see the note under Level 2 on how it is computed.

=== Level 0 ===

Defined as:
{{{
(userid.validity <= Marginal AND
 tofu.signcount == 0 AND
 tofu.enccount == 0 AND
 key.source NOT in [cert, pka, dane, wkd])
}}}

Explanation:

This level is assigned to keys that were never used
to verify a signature, never used for an encryption,
and not obtained from a source that gives some basic
indication that the key is actually controlled by its
alleged owner.  That is, we have no evidence that the
user can actually decrypt messages encrypted with this key.
Consequently, this key should not be automatically used
to encrypt messages to this recipient.

Usage:

* Encryption: Do not automatically encrypt to Level 0.
* Validation: N/A

=== Level 1 ===

Defined as:
{{{
((userid.validity == Marginal AND
  tofu.validity < "Key with enough history for basic trust") AND
  key.source NOT in [cert, pka, dane, wkd]):
}}}

Explanation:

We have verified at least one message signed using this key,
or encrypted at least one message to this key, but not more
than a handful, and we didn't find the key via a convincing
source (e.g., WKD).

This level means that there is some evidence that the recipient
actually controls this key.  Thus, it is okay to automatically
encrypt to this recipient using this key.  (Note: if there is
a MitM, then the intended recipient will still be able to
read the message so this decision will not negatively impact
usability.)

We don't, however, have that much evidence that the key belongs
to the stated person.  For instance, the key could have been
imported when the user examined a phishing mail.  As such,
we should not indicate to the user that the contents of the
messages are in any way valid.

Usage:

* Encryption: Automatically encrypt.
* Validation: Display the message as if it wasn't signed.

=== Level 2 ===

Defined as:
{{{
(userid.validity == Marginal AND
 ((tofu.validity >= "Key with enough history for basic trust") OR
  key.source IN [cert, pka, dane, wkd])):
}}}

Note: validity is computed based on the number of days
on which the user verified a message / encrypted a message,
not the total number of verified messages / encrypted messages.

Explanation:

We have verified a bunch of signatures at different times
from this key and/or encrypted messages to this key a bunch
of times; or, the key came from a semi-trusted source.

Given the evidence that the key is actually controlled
by the stated person, it makes sense to both encrypt to
this key and to show that the message is signed.

Note: a common phisher is unlikely to take the time to get
to this level.  Thus, non-targeted phishing mails will
normally not show up as trusted.


Usage:

* Encryption: Automatically encrypt
* Validation: Show that the message was signed

=== Level 3 ===

Defined as:
{{{
userid.validity == Full AND
"no key with ownertrust ultimate signed this userid."
}}}

Explanation:

The userid is fully trusted, but only indirectly so, i.e.,
via the Web of Trust, or because the TOFU policy was set to good.

Level 3 and level 4 are indications that the communication
partner is authenticated.  The distinction between levels
3 and 4 provides flexibility for organizational measures like:
"You should only send restricted documents to certain keys."
This can be realized by having an organization key that is
ultimately trusted and responsible for signing those keys.

This is also the level for S/M~I~M~E mails.

Usage:

* Encryption: Automatically encrypt to Level 3
* Validation: Indicate that the message came from the sender.

=== Level 4 ===

Defined as:
{{{
userid.validity == Ultimate OR
(userid.validity == Full AND
 "any key with ownertrust ultimate signed this userid.")
}}}

Explanation:

The key is either ultimately trusted or signed by an ultimately
trusted key.

See level 3 for an explanation of this level.

Usage:

* Encryption: Automatically encrypt to level 4
* Validation: Show verified messages as "the best". Stars and sprinkle level ;-)
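The four level definitions above can be collapsed into a single classifier.  The following is an illustrative sketch: the parameter names and value encodings are ours, not GPGME's actual A~P~I, and the ordering of the checks assumes the levels are evaluated from most to least trusted.

```python
# Sources that give some basic indication of key ownership (see Level 0/2).
AUTO_SOURCES = {"cert", "pka", "dane", "wkd"}

def trust_level(validity, tofu_signcount, tofu_enccount,
                tofu_basic_trust, key_source, ultimate_signed):
    """Map a userid's state onto levels 0-4 as defined above.

    validity: "unknown", "never", "marginal", "full" or "ultimate"
    tofu_basic_trust: True if the binding has enough history for basic trust
    ultimate_signed: True if a key with ownertrust ultimate signed this userid
    """
    if validity == "ultimate" or (validity == "full" and ultimate_signed):
        return 4
    if validity == "full":
        return 3
    if validity == "marginal":
        if tofu_basic_trust or key_source in AUTO_SOURCES:
            return 2
        if tofu_signcount > 0 or tofu_enccount > 0:
            return 1
    return 0
```

For example, a marginal key freshly fetched from WKD lands at level 2, while the same key obtained from a public keyserver with no history lands at level 0.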

=== Rationale ===

==== Time delay for level 2 ====

Using the number of days on which we saw signed messages, rather than
the total number of messages, to determine a key's
"believability" makes a simple attack more expensive.
For instance, a phisher might first send a bunch of
signed spam so that, if the user opens them, the key will be marked
as valid (level 2).  But it is much more expensive for the attacker to retain
state (the key, how many messages were sent, etc.) than to just fire
and forget.
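The day-based counting can be sketched as follows.  This is a toy illustration: the threshold constant is ours, and GnuPG's actual TOFU heuristics differ in detail.

```python
from datetime import datetime, timezone

# Assumed threshold: the number of distinct days with signed messages
# needed for "enough history for basic trust".  Illustrative only.
BASIC_TRUST_DAYS = 4

def days_with_signatures(timestamps):
    """Count the distinct (UTC) days on which signed messages were seen."""
    return len({datetime.fromtimestamp(t, tz=timezone.utc).date()
                for t in timestamps})

def has_basic_trust(timestamps):
    return days_with_signatures(timestamps) >= BASIC_TRUST_DAYS

# 100 signed spam mails fired off in one burst still count as one day:
burst = [1700000000 + i for i in range(100)]
assert days_with_signatures(burst) == 1
assert not has_basic_trust(burst)
```

This is why the fire-and-forget spammer never reaches level 2: the burst collapses to a single day, while a legitimate correspondent accumulates distinct days naturally.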

A time delay also gives others the chance to intervene if they detect
an attack, e.g., if it is against a whole organization.

==== HTTPS Trust as shortcut to level 2 ====

Using H~T~T~P~S for key discovery will automatically bring a key
to level 2, because in that case we have a claim by an authenticated
source that this key really belongs to the corresponding mail address.

If an attacker controls your H~T~T~P~S, there are very likely cheaper and
less detectable attacks on your communication than intercepting pgp encryption,
e.g., compromising your system.

It's also harder to break H~T~T~P~S than S~M~T~P~S/I~M~A~P~S, because
every M~U~A offers to ignore certificate errors (which dirmngr does not), and
a compromised router could claim that your M~S~P only offers S~M~T~P / I~M~A~P
without encryption.

== Presentation ==

There should only be prominent information when reading a signed mail if:

* There is additional information that the sender really is the intended
  communication partner. (Level >= 2)

This could be displayed as a checkmark or a seal / ribbon or something.  It should
be prominent and next to the signed content.  There should be a distinction
between levels 2, 3, and 4, but it may be slight.

=== Don't treat signed mails worse than unsigned mails ===

A MUA should not treat any signed mail worse than an unsigned mail.  If a
sender is not verified, the mail should be displayed like an unsigned mail, because
in both cases you have no information that the sender is actually your intended
communication partner.  You may want to show a TOFU conflict more prominently, as
user interaction is required at that point.

Especially: **ignore GPGME's Red suggestion.** An attacker would have removed
the signature instead of invalidating it.  Such a mail should be treated like an
unsigned mail, with the additional information shown only in the details, for
diagnostic purposes.  The same applies when Red is set because a key is expired
or similar.  It's no more negative than an unsigned mail, so only if your
M~U~A shows unsigned mails as "Red" may you treat signed mails this
way, too ;-)

== Conflict handling ==

A TOFU conflict occurs when there are multiple keys with the same
mailbox and it is not possible to automatically determine which ones
are good.  If two keys are cross-signed, they are not considered
to conflict; this is just a case of the user rotating her primary key.

Conflicts occur in two situations:

Attacks:

# A MitM controlled the initial key exchange.  If a good message gets through, there will be a conflict.  The "new" key is the correct key.
# An attacker attempts a MitM attack, but the user already has the right key.  The "old" key is the correct key.
# An attacker sends a forged message.  The "old" key is the correct key, or both are bad (the first key was also due to a forgery).
# There is a Troll trying to hurt usability so much that automated encryption is no longer used (i.e., many forgeries resulting in gratuitous conflicts)

Misuse:

# A user generated two keys, e.g., on two devices, did not
cross-sign them, and uses both.
# A user lost control of his old key and does not have a revocation
certificate.

Both misuse cases should be handled on the sender's side, because
he controls (or lost control of) the involved keys and can take steps /
inform himself about what went wrong.

The second misuse case (inaccessible key) is likely more common
than the first misuse case (multiple, valid keys).
The first misuse case already leads to problems:
communication partners need to choose which key to use.

Losing keys can also be caused by software that provides a bad
user experience or does not follow common practices.

=== Resolving conflicts on the sender's side ===

When an application makes a signature, it should check
that there are no other keys with the same user id.
If there are, and the private key material is available,
the application should prompt the
user to make a cross signature.

Similarly, if a signature is verified by a MUA, secret keys
with the same userid are available, and they are not cross-signed,
the user should be prompted to make a cross signature.

==== WKD Checks ====

A tofu aware ~G~U~I should check, when signing or from time to time,
whether a different key has been uploaded to the WKD for this userid, and warn
in that case.  This also makes a permanent man-in-the-middle
attack by a mail service provider more expensive, as it would
mean providing a different key to the attacked user than to
others.

==== Automated conflict resolution by recipients ====

Let us assume that there are two keys, K1 and K2, with the
email address alice@example.org, and that they are in conflict.
K1 is a key with an established communication history; K2 is
a key without history.  We either see a message from
alice@example.org signed with K2, or fetch K2 through the
Web Key Directory of example.org.

In this case, we cannot with certainty resolve the conflict
(otherwise, we would have resolved it automatically!).
Consider:

  - If we discover K2 because a good message or WKD access
    finally got through a MitM attack, then K2 is the
    correct key.

  - If we discover K2 due to a forgery, then K1
    is the correct key.

In fact, the "correct key" could also
be bad!  For instance, if the user never interacted
with Alice and got two different forgeries, then
neither key is the correct one.  But when both keys are
bad, our conflict resolution does no harm: we are in
the case where an attacker controls all of your communication
and does not care about detection.  This attack is out
of scope for automated encryption.

In case K1 is provided by WKD, we can accept K1 as still
valid, because it is still publicly available and we can
assume that a conflict would have been detectable by
the sender.

If we don't have a WKD, we still want to use K1 for
encryption, but no longer show it as valid.

We don't want to autoresolve to K2, because then we would have
two good keys while something is clearly wrong.  If a user has lost control of
his old key and is unable to revoke it, we want to create problems for the sender,
so that the sender notifies us that the new key should be used and
we can mark the old key explicitly as bad.

If K2 is available in the Web Key Directory, we also don't
want to autoresolve to it, because that would make an attack
by the Mail Service Provider too easy.  Imagine the MSP
wants (or is forced) to intercept some specific communication:
he could ensure that, for a specific communication partner, a M~I~T~M key
(K2) is provided.  If we automatically
fetch that key through WKD and automatically accept K2 as
valid, the attacker would have reached his goal.  So we
stay with K1 for encryption, but no longer show K1 as valid,
to make it detectable that there is a problem.


If a WKD is not available and we assume that at least one
of the keys is probably good, then in all of the situations
outlined above except one (a good message gets through
a MitM attack), the old key is the correct key.

If we consider a good message from Bob getting through
a MitM attack, there are two situations: the good
message is only signed, or it is signed and encrypted.

If the message is signed and encrypted, then Alice will be
unable to decrypt it (because the MitM didn't re-encrypt it),
and we won't actually see a conflict, because we never
see the signature.
The message that fails to decrypt, however, will likely
cause Alice to talk to Bob.  But it is unlikely that
they will correctly diagnose the problem, in particular because,
once the MitM is back in control, everything will continue
to work.  Thus, they will likely conclude that there was
a misconfiguration or a bug.

If the message is only signed, then Alice fetches the key
used to sign the message (since she hasn't seen it before)
and we detect a conflict.

We conclude that the above scenario (a MitM and a good signed message getting
through) is sufficiently rare that we will automatically
decide for the old key if there is a conflict.  But if the old
key is no longer available through WKD, we show the verification
status as "not trusted" (i.e., level 1), which might
spur the user to try and figure out why mails from that
particular user are no longer marked as verified.  So the
rare case of a MitM and a good signed (but not encrypted) message getting
through will still be detectable, because the messages won't be shown
as "green" / valid anymore.


Pseudocode:
{{{
    if (K1.source == WKD)
      {
        K1.validity = Level 2 (Tofu.validity == Basic Trust)
        K1.policy = auto
        K2.validity = Level 0 (Tofu.validity == bad)
        K2.policy = bad
        encrypt_to = K1
      }
    else
      {
        K1.validity = Level 1 (Tofu.validity == conflict)
        K1.policy = ask
        K2.validity = Level 1 (Tofu.validity == conflict)
        K2.policy = ask
        encrypt_to = K1
      }
}}}
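The pseudocode above can be written as a runnable sketch.  The {{{Key}}} class and its attribute names are a hypothetical data model for illustration, not GPGME's actual A~P~I.

```python
# Hypothetical data model for the conflict-resolution pseudocode above.
class Key:
    def __init__(self, source=None):
        self.source = source   # e.g. "wkd", "keyserver"
        self.level = None      # trust level as defined in this document
        self.policy = None     # tofu policy: "auto", "ask", or "bad"

def resolve_conflict(k1, k2):
    """K1 is the key with established history, K2 the newly seen key."""
    if k1.source == "wkd":
        # K1 is still published via WKD: keep trusting it, mark K2 bad.
        k1.level, k1.policy = 2, "auto"
        k2.level, k2.policy = 0, "bad"
    else:
        # Without WKD, keep encrypting to K1 but stop showing it as valid.
        k1.level, k1.policy = 1, "ask"
        k2.level, k2.policy = 1, "ask"
    return k1  # the key to encrypt to
```

In both branches we keep encrypting to K1; the difference is only whether K1 is still displayed as valid and whether K2 is marked bad outright.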

=== Does this model protect against our threats? ===

* Man in the middle with full control of the communication history:

If Alice and Bob always encrypt & sign, but Mallory has had full control of
their communication history and re-encrypts / re-signs the messages regularly,
this attack can be detected:

a) When a mail gets through once, e.g., if Mallory controls Alice's router
and Alice uses an Internet cafe once.  In that case Bob would not be able
to decrypt the message.  That communication failure could lead to more
investigation.  A MUA may assist by showing to which keys a mail was
encrypted in case decryption fails.

b) If Alice publishes her key in a Web Key Directory, Bob's MUA can detect
that the key used for Alice does not match the one he has always used, and can
signal this by not showing Alice's signatures as "Good", indicating that
the communication is not protected.

The attack is more problematic if Alice and Bob don't encrypt but just
sign.  In case a) this would mean that the mails that get through from
the real Alice would not be shown as valid.  But once a mail gets through
from the real Alice, the messages from both Alice and Mallory will no longer
be shown as valid, indicating that the communication is not
secure.

* Man in the Middle attack with established communication:

This attack will be prevented by keeping the established key in use, so messages
won't get encrypted to Mallory.  By no longer showing the validity indication,
this attack is also detectable.

* Impostor attack: this is still prevented, because the new key is never shown
as valid / verified without user interaction.

* The Troll attack will be prevented because conflicts are not shown so prominently.  Using
a Web Key Directory can also mitigate this attack, because it prevents marking
many keys as bad through spam.

== Key discovery and Opportunistic Encryption (Mail only) ==

A MUA should offer automated key discovery and opportunistic encryption.
The WKD / WKS helps with automated key discovery and should be used (via {{{--locate-key}}}).

To determine whether a mail can be sent automatically encrypted:

* Is there a key of at least level 1 for each recipient of the mail?
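This recipient check reduces to a trivial predicate (level numbers as defined above; the function name is ours):

```python
def can_auto_encrypt(recipient_levels):
    """Encrypt automatically only if every recipient has a key of at
    least level 1 (some evidence the recipient controls the key)."""
    return bool(recipient_levels) and all(lvl >= 1 for lvl in recipient_levels)

assert can_auto_encrypt([2, 1, 4])
assert not can_auto_encrypt([2, 0])  # one recipient without a usable key
```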

=== Auto Key Retrieve ===

In addition to the WKD lookup, if you receive a message signed by an unknown key, a MUA
should automatically retrieve it from a public keyserver or a Web Key Directory.
This key can then be used for opportunistic encryption: because you have seen
a signature, it is very likely that the recipient can decrypt.
(auto-key-retrieve in gnupg)
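The building blocks for this already exist as GnuPG configuration options.  A sketch of a matching {{{gpg.conf}}} (the option names are real GnuPG options; the exact combination is a suggestion):

```
# ~/.gnupg/gpg.conf (sketch)
trust-model tofu+pgp                 # combine TOFU history with the web of trust
auto-key-locate local,wkd,keyserver  # try the local keyring and WKD before keyservers
auto-key-retrieve                    # fetch unknown signing keys when verifying
```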

== Example GpgOL ==

[[EasyGpg2016/OutlookUi|Example screenshots / UX design from GpgOL]]