Bug 738247 – [SMTP] Do not use local host name for HELO/EHLO

Summary:	[SMTP] Do not use local host name for HELO/EHLO


Status:	RESOLVED OBSOLETE

Product:	evolution-data-server
Classification:	Platform
Component:	Mailer
Version:	3.24.x (obsolete)
Hardware:	Other Linux

Importance:	High normal
Target Milestone:	---
Assigned To:	evolution-mail-maintainers
QA Contact:	Evolution QA team

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2014-10-09 16:42 UTC by Eugene Kanter
Modified:	2021-05-19 11:03 UTC

See Also:
GNOME target:	---
GNOME version:	---

Description Eugene Kanter 2014-10-09 16:42:41 UTC

Similar to bug 702703, but this time the SMTP handshake is leaking unwanted information.

Thunderbird:

Received: from [local IP] (external IP) # external IP format is MTA dependent.

Evolution:

Received: from SENDER_FQDN (external IP) # external IP format is MTA dependent.

The samples above were produced by the same SMTP server. The "local IP" is resolvable on the local network only and is located behind a NAT.

In general, many SMTP servers will try to resolve the address of the incoming connection:

Received: from REVERSE_DNS (REVERSE_DNS [external IP])

or use "unknown" in its place if reverse DNS is not available.

Since the SENDER_FQDN above is not resolvable by the SMTP server, it must have come from Evolution.

Comment 1 Eugene Kanter 2017-09-14 17:11:05 UTC

This is still the case. The local host name shows up in recipient email headers.
Sent from Evolution:
Received: from [external IP] ([external IP:port] helo=myhost.mydomain.ext)

Sent from Thunderbird:
Received: from [external IP] ([external IP:port] helo=[local IP])

The HELO message must be changed to avoid local host name disclosure.

Comment 2 André Klapper 2017-09-23 17:08:30 UTC

Which specific problems are created by disclosing the local hostname?

Comment 3 Eugene Kanter 2017-09-24 17:44:50 UTC

Bug 702703 says:

the actual host name must not be used in the Message-ID field

and it was fixed.

For this bug, the specific problem is:

the actual host name must not be used in HELO.

Comment 4 André Klapper 2017-09-24 19:36:09 UTC

Could you answer my question in comment 2?

Comment 5 André Klapper 2017-09-25 10:52:27 UTC

Setting NEEDINFO status until someone explains the *underlying* problem created by the current behavior.

Exposing the local IP is way more concerning to me, personally speaking.

Comment 6 Milan Crha 2017-10-09 07:15:54 UTC

[Off-topic] One thing confuses me. This request is about not exposing the local host name in the Received header. Whether that happens depends on the SMTP server settings, since the server can use information from the HELO/EHLO command to populate the Received header. Yet the reporter exposes his email address in public. I consider my email address a much more private thing than my local computer name (and it's really only about the name, the IP would still be there), because evil people can do evil things with an email address, while a host name can at most be embarrassing to someone, though why would one choose an embarrassing local computer name in the first place...

Please do not get me wrong: I'm not against the change personally, and I'm fine with not sending the local computer name in the SMTP HELO/EHLO command. I only wanted to point out one (for me) confusing thing with this off-topic comment. No offence was meant either.

Comment 7 Milan Crha 2017-10-09 18:17:12 UTC

See [1]: it's expected that the SMTP client uses an FQDN, and only if one cannot be found does it use an IP address instead. Camel code relies on g_resolver_lookup_by_address(), but that doesn't return only a public FQDN; it also returns local machine names or "internal domain" names. I do not see a way in GResolver to tell it to use only a public FQDN (from public reverse DNS records), nor do I know whether it's at all possible to reliably distinguish between the two.

Again, as [1] says, the SMTP client should use an FQDN, and only if it is not known can it use an IP address. If Thunderbird always passes an IP address, then it does not conform to [1]. I do not use it, so I cannot tell what it does; I only mention the possibility because we have been comparing these two mail clients from the beginning.

[1] https://tools.ietf.org/html/rfc5321#section-4.1.1.1
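
To illustrate the rule from [1], here is a minimal sketch in plain C of choosing the EHLO/HELO argument: the FQDN when one is known, otherwise an address literal in the bracketed form that RFC 5321 defines ("[192.0.2.1]" for IPv4, "[IPv6:...]" for IPv6). This is not the Camel code; the function name format_ehlo_argument and its parameters are hypothetical, and the caller is assumed to have already obtained the name and the textual IP address.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of Camel.
 * Writes the EHLO/HELO argument into `out`:
 *   - the FQDN when one is known,
 *   - otherwise an address literal: "[a.b.c.d]" or "[IPv6:addr]".
 * Returns 0 on success, -1 on missing input or a too-small buffer. */
int
format_ehlo_argument (const char *fqdn,     /* may be NULL or "" */
                      const char *ip_text,  /* textual IP address */
                      char *out,
                      size_t out_len)
{
	int n;

	if (fqdn != NULL && *fqdn != '\0')
		n = snprintf (out, out_len, "%s", fqdn);
	else if (ip_text == NULL || *ip_text == '\0')
		return -1;
	else if (strchr (ip_text, ':') != NULL)  /* a colon implies IPv6 */
		n = snprintf (out, out_len, "[IPv6:%s]", ip_text);
	else
		n = snprintf (out, out_len, "[%s]", ip_text);

	return (n >= 0 && (size_t) n < out_len) ? 0 : -1;
}
```

With no FQDN and the address 192.0.2.1, the sketch produces "[192.0.2.1]", which matches the helo=[local IP] form seen in the Thunderbird sample from comment 1.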

Comment 8 Milan Crha 2017-10-10 06:04:30 UTC

Would it be a good solution to use the resolved name only if it contains at least two dots? It may satisfy both the RFC requirement and your privacy interest, right?
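
The two-dot check could look roughly like the following sketch in plain C. The function name name_has_two_dots is hypothetical and nothing here is taken from the Camel sources.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 when `name` contains at least two dots,
 * 0 otherwise (including for NULL). */
int
name_has_two_dots (const char *name)
{
	int dots = 0;

	if (name == NULL)
		return 0;

	for (; *name != '\0'; name++) {
		if (*name == '.')
			dots++;
	}

	return dots >= 2;
}
```

Note that this only counts characters: "localhost" and "localhost.localdomain" are rejected, but an internal name such as "computer1.company.local" still passes.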

Comment 9 Ángel 2017-10-12 18:56:59 UTC

On the mailing list Pete commented about spamassassin using the values of Received header in their score. However, when sending an email, the MUA is authenticated to the MSA. And if that is sent to another server, the spamassassin at that side will only evaluate the headers added by their own server, as otherwise it would be easily tricked by fake headers added by spammers (I wouldn't be surprised if there were a number of them misconfigured, though).

I'm unsure how that would be weighted by antispam systems that are filtering outbound mail, or when it goes to a local user. Maybe some of them would erroneously take into account that hostname.


In my opinion, the solution would be to have a dconf key that overrides g_resolver_lookup_by_address(), i.e. if it is set, that value is used in HELO instead of the result of g_resolver_lookup_by_address().

In addition to this "anonymization", I guess it might be useful if there is no PTR record or it is "wrong" (e.g. the right value to use would be computer1.company.com but something like computer1.company.local is being returned). Although, given these are authenticated sends, the effect that such values would have on a properly configured server should be negligible.

Comment 10 Eugene Kanter 2017-10-12 23:29:11 UTC

(In reply to André Klapper from comment #4)
> Could you answer my question in comment 2?

Sorry I was away from the Internet.

> Which specific problems are created by disclosing the local hostname?

Imagine a scenario:
A person runs home business example.org from workstation evolution.example.org.
Sending mail via Gmail, Yahoo, AOL, etc. exposes the company name to all recipients, despite the VPN which hides the example.org external IP address from the recipient SMTP server.

bug 702703 was resolved by using recipient domain after @ in message-id.
In my opinion, permanent solution would be to use recipient domain in HELO/EHLO response.

If for any reason my solution is rejected I would like some guidance as how to propagate recipient email into

static gboolean
smtp_helo (CamelSmtpTransport *transport,
           CamelStreamBuffer *istream,
           CamelStream *ostream,
           GCancellable *cancellable,
           GError **error)
method.

https://git.gnome.org/browse/evolution-data-server/tree/src/camel/providers/smtp/camel-smtp-transport.c

Comment 11 Milan Crha 2017-10-13 06:31:32 UTC

(In reply to Ángel from comment #9)
>  as otherwise it would be easily tricked by fake headers added by spammers

I see, that makes sense. In that case let's ignore the possible issue with spam detection software.

> In my opinion, the solution would be to have a dconf key that overrides
> g_resolver_lookup_by_address() Ie. if it is set, that value is used on helo,
> instead of using g_resolver_lookup_by_address()

I would like to avoid such an option. As ekanter said, once company and private business are mixed on one machine, a single value never matches both/all usages. The current SMTP code looks up the local address based on the established connection, which can be the VPN address when the connection went through a VPN and the local address when it used a direct connection (that's my case here).

> In addition of this "anonymization", I guess it might be useful if there is
> no ptr or it is "wrong" (eg. the right value to use would be
> computer1.company.com but something like computer1.company.local is being
> returned).

I'm afraid of that 'it is "wrong"' part. Being able to detect it means that there's either a whitelist or a blacklist of allowed or disallowed domains, and I'm not very willing to maintain such a list.

That's why I suggested the simple heuristic about two dots. It will avoid host names without dots, and also things like "localhost.localdomain", even though it would still consider both of your examples valid and use them.

I also do not want to break the behaviour for others in case the change causes trouble for anyone, so there should be an option somewhere to return to the original behaviour. I'd suggest a three-state option with this meaning:
   0 - autodetect (two dots at least)
   1 - trust g_resolver_lookup_by_address() (current behaviour)
   2 - always use local IP
with the default being '0', thus to autodetect.
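
As a rough sketch of how the three states could be applied (plain C, not actual Camel code; the enum and function names are hypothetical, and the caller is assumed to pass in the resolver result and a ready-made IP address literal):

```c
#include <stddef.h>

/* Hypothetical names for the three proposed states. */
typedef enum {
	HELO_MODE_AUTODETECT = 0,      /* use the name only with two dots or more */
	HELO_MODE_TRUST_RESOLVER = 1,  /* current behaviour */
	HELO_MODE_ALWAYS_IP = 2        /* never send a name */
} HeloMode;

static int
count_dots (const char *s)
{
	int dots = 0;

	for (; *s != '\0'; s++) {
		if (*s == '.')
			dots++;
	}

	return dots;
}

/* Returns either `resolved_name` or `ip_literal`, never a copy.
 * A missing or empty resolved name always falls back to the IP literal. */
const char *
choose_helo_name (HeloMode mode,
                  const char *resolved_name,
                  const char *ip_literal)
{
	if (resolved_name == NULL || *resolved_name == '\0')
		return ip_literal;

	switch (mode) {
	case HELO_MODE_TRUST_RESOLVER:
		return resolved_name;
	case HELO_MODE_ALWAYS_IP:
		return ip_literal;
	case HELO_MODE_AUTODETECT:
	default:
		return count_dots (resolved_name) >= 2 ? resolved_name : ip_literal;
	}
}
```

In this sketch an unknown mode value is treated like autodetect, which is one possible choice and not something the proposal specifies.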

(In reply to ekanter from comment #10)
> bug 702703 was resolved by using recipient domain after @ in message-id.
> In my opinion, permanent solution would be to use recipient domain in
> HELO/EHLO response.

I would not mix the message ID and the HELO/EHLO command; those are two very different usages. I do not want to lie in HELO/EHLO, and I do not want to help spammers by any means. (I mentioned it on the mailing list [1]: I have already used the headers to track down a spammer, so such behaviour would just make it harder to find him/her.)

[1] Just for a reference, this is the thread I'm talking about:
https://mail.gnome.org/archives/evolution-list/2017-October/msg00044.html
It started here:
https://mail.gnome.org/archives/evolution-list/2017-September/msg00075.html
and with a breakage in headers continued here:
https://mail.gnome.org/archives/evolution-list/2017-September/msg00077.html

Comment 12 Eugene Kanter 2017-10-13 17:23:10 UTC

(In reply to Milan Crha from comment #11)
> 
> (In reply to ekanter from comment #10)
> > bug 702703 was resolved by using recipient domain after @ in message-id.
> > In my opinion, permanent solution would be to use recipient domain in
> > HELO/EHLO response.
> 
> I would not mix message ID and the HELO/EHLO command, those are two very
> different usages. I do not want to lie in HELO/EHLO, I do not want to help
> spammers by any means. (I mentioned it on the mailing list, I already
> used the headers to track down a spammer, thus such behaviour would just
> avoid to find him/her).

I politely disagree. Professional spammers most likely don't and won't use Evolution. The vast majority of MTAs will not omit the external IP address, thus HELO is generally irrelevant. If passing the recipient domain to smtp_helo is not feasible, I would hardcode localhost.localdomain for myself until a satisfactory solution is finalized. I maintained my own patch for bug 702703 for quite some time until it was resolved, although it was quite annoying.

PS. Is there a similar discussion among Thunderbird users? If not, why not just follow Thunderbird?

Comment 13 Eugene Kanter 2017-10-13 19:03:06 UTC

(In reply to ekanter from comment #10)
> 
> bug 702703 was resolved by using recipient domain after @ in message-id.
> In my opinion, permanent solution would be to use recipient domain in
> HELO/EHLO response.
> 
> If for any reason my solution is rejected I would like some guidance as how
> to propagate recipient email into
> 
> static gboolean
> smtp_helo (CamelSmtpTransport *transport,
> 	   CamelStreamBuffer *istream,
> 	   CamelStream *ostream,
>            GCancellable *cancellable,
>            GError **error)
> method.
> 
> https://git.gnome.org/browse/evolution-data-server/tree/src/camel/providers/
> smtp/camel-smtp-transport.c

Just noticed an error. The fragment above should read "sender domain" in all mentioned cases, not "recipient domain".
In other words, I would like to see the sender email domain in EHLO/HELO.

Comment 14 Milan Crha 2017-10-16 09:03:38 UTC

(In reply to ekanter from comment #12)
> PS. is there a similar discussion among thunderbird users? if not then why
> do not just follow thunderbird?

I do not know, I do not follow Thunderbird development at all. See comment #7, I'd like to stick with RFC on reasonable level.

What about the 0/1/2 option suggested in comment #11, good/bad/nonsense from your point of view? Once we settle on something satisfying both your and my requirements I'll just do it.

Comment 15 Eugene Kanter 2017-10-17 03:36:02 UTC

(In reply to Milan Crha from comment #14)
> (In reply to ekanter from comment #12)
> stick with RFC on reasonable level.
Here you go, a *reasonable* level:
Get the FQDN from g_resolver_lookup_by_address(), strip everything after the first dot, and append the sender email domain.
For example, if the FQDN is fedora.example.com and the sender's email is abc@example.net, then HELO would be fedora.example.net.
If there is no FQDN, then preserve the current behavior.
In the case of localhost.localdomain I have no suggestion and no preference.
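
A minimal sketch of this proposal in plain C, only to make the string handling concrete. The function name helo_with_sender_domain is hypothetical; a return value of -1 stands for "no usable FQDN or sender domain, keep the current behavior".

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: keep the host part of `fqdn` (everything before
 * the first dot) and append the domain of the `sender` address.
 * Returns 0 on success, -1 when the inputs do not fit the scheme or
 * the output buffer is too small. */
int
helo_with_sender_domain (const char *fqdn,
                         const char *sender,
                         char *out,
                         size_t out_len)
{
	const char *dot, *at;
	int n;

	if (fqdn == NULL || sender == NULL)
		return -1;

	dot = strchr (fqdn, '.');     /* end of the host part */
	at = strrchr (sender, '@');   /* start of the sender domain */

	if (dot == NULL || dot == fqdn || at == NULL || at[1] == '\0')
		return -1;

	n = snprintf (out, out_len, "%.*s.%s", (int) (dot - fqdn), fqdn, at + 1);

	return (n >= 0 && (size_t) n < out_len) ? 0 : -1;
}
```

For the example above, fedora.example.com combined with abc@example.net gives fedora.example.net.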

> 
> What about the 0/1/2 option suggested in comment #11, good/bad/nonsense from
> your point of view? Once we settle on something satisfying both your and my
> requirements I'll just do it.
I don't like any of them, please see above for an alternative.

Comment 16 Milan Crha 2017-10-17 12:06:05 UTC

(In reply to ekanter from comment #15)
> get the fqdn from g_resolver_lookup_by_address(), strip all after first dot
> and append sender email domain.

The problem with this is that it's a lie and can mislead the attempt to track the machine from which the message had been sent.

> I don't like any of them, please see above for an alternative.

Pity, I have nothing better to make it possible to not expose local host name and still make it possible to return back to the previous behaviour for users where the autodetect change would cause any trouble.

Comment 17 Eugene Kanter 2017-10-17 20:17:00 UTC

(In reply to Milan Crha from comment #16)
> (In reply to ekanter from comment #15)
> > get the fqdn from g_resolver_lookup_by_address(), strip all after first dot
> > and append sender email domain.
> 
> The problem with this is that it's a lie and can mislead the attempt to
> track the machine from which the message had been sent.
> 
It is only a very small lie, because the host name, the most important identifying part, is not altered. The domain is not important, because the nearest SMTP server will report the actual IP address.

> > I don't like any of them, please see above for an alternative.
> 
> Pity, I have nothing better to make it possible to not expose local host
> name and still make it possible to return back to the previous behavior for
> users where the autodetect change would cause any trouble.

André does not like 2; 1 is the current, unwanted behavior; and 0 I don't really understand, but it seems the same as 1 in my case.

I would like to repeat my request to help me implement my proposed logic. I maintained my own solution for bug 702703 for a number of years until bug 702703 was closed.

Comment 18 Milan Crha 2017-10-18 07:31:22 UTC

(In reply to ekanter from comment #17)
> it is only a very small lie

It depends on the point of view. If I send a spammer's message to the administrator and he/she does not pay attention, he/she can easily misidentify the originating machine (nobody said that your local machine name cannot clash with a machine name from the email domain). And I do not like lies, so it's not a way I would approve.

> Andre does not like 2, 1 is a current, unwanted, behavior and 0 I don't
> really understand but it seems the same as 1 in my case.

Yes, that's just about it. I cannot satisfy everyone's needs, so I want to add an option where users can influence the behaviour based on their requirements, with a way to fall back to the previous behaviour.

> >   0 - autodetect (two dots at least)

It would be the default. If the resolved name has no dots or only one, then it's highly probable that it's a local machine name, in which case it would not be used and the IP would be used instead. That makes "localhost" and "localhost.localdomain" use the IP, while "machine.example.com" is used as is (it has two dots).

> >   1 - trust g_resolver_lookup_by_address() (current behaviour)

This is the fallback.

> >   2 - always use local IP

This is for you.

> I would like to repeat my request to help me implement my proposed logic.

I'd prefer to do this all at once, which should be easier for you too, at least in that you can use one solution in your custom build and it will keep working the same once you update to the changed eds version. A quick change to always use the IP:
https://mail.gnome.org/archives/evolution-list/2017-September/msg00102.html

Comment 19 Ángel 2017-10-18 19:34:23 UTC

Actually, I don't think making a custom build of evolution would be needed. It looks like providing a different g_resolver_lookup_by_address through a LD_PRELOAD library would be enough to get whatever custom hostname that is desired.

Comment 20 Eugene Kanter 2017-10-19 14:35:10 UTC

(In reply to Ángel from comment #19)
> Actually, I don't think making a custom build of evolution would be needed.
> It looks like providing a different g_resolver_lookup_by_address through a
> LD_PRELOAD library would be enough to get whatever custom hostname that is
> desired.

That is somewhat more complicated to start with. I have a well-defined process for building custom packages in mock, so adding another one is better than climbing the learning curve of building a different g_resolver_lookup_by_address.

It looks like I am inclined to strip the domain name from the g_resolver_lookup_by_address output, and that would completely satisfy me.

I somewhat agree with Milan that artificially appending the sender email domain would look too drastic from an RFE point of view.

Comment 21 Jeffrey Stedfast 2018-11-25 03:54:11 UTC

FWIW, the domain or IP address used in the HELO/EHLO needs to be accurate because some SMTP servers will reject the client if it is not.

In other words, using the IP address instead of an FQDN should be fine, but you can't use a fake hostname reliably.

Comment 22 André Klapper 2021-05-19 11:03:26 UTC

GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. 
As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow
  https://wiki.gnome.org/Community/GettingInTouch/BugReportingGuidelines
and create a new ticket at
  https://gitlab.gnome.org/GNOME/evolution-data-server/-/issues/

Thank you for your understanding and your help.