gh-118716: Do not enforce ASCII to format address#118717
gh-118716: Do not enforce ASCII to format address#118717cedk wants to merge 2 commits intopython:mainfrom
Conversation
|
You need to update this test: cpython/Lib/test/test_email/test_email.py Lines 3241 to 3245 in 26bab42 |
|
The commit that added this test pre-dates that rfc: 8debacb. |
|
IMO the legacy API (which is what you are modifying) should not support utf-8 addresses, since it provides no mechanisms for correctly interoperating with smtplib the way the new API does. Therefore I recommend rejecting this PR. See my comments on the issue for more. |
|
Could you add a news entry? https://blurb-it.herokuapp.com/add_blurb |
| """ | ||
| name, address = pair | ||
| # The address MUST (per RFC) be ascii, so raise a UnicodeError if it isn't. | ||
| address.encode('ascii') |
There was a problem hiding this comment.
I object to this change. I don't think the legacy API should be modified.
| self.assertEqual(local_part.local_part, r'\example\\ example') | ||
|
|
||
| def test_get_local_part_unicode_defect(self): | ||
| def test_get_local_part_unicode(self): |
There was a problem hiding this comment.
This is not, IMO, the best solution. But fixing it right is non-trivial. The issue here is that this IS a defect if policy.utf8=False, and that should be maintained. The problem is that currently the header value parser does not have access to the policy. Which was a design mistake.
The reason you want to maintain the old behavior (aside from backward compatibility, which definitely counts) is that the parser is designed to be used in multiple scenarios, and one of those scenarios might be a place were we want to be emulating a non-SMTPUTF8 parser and looking at where a parsed message does not pass muster.
Now, that said, I'm not seeing any good way to fix this right. I think the best compromise might be to leave this defect alone, but to change the Address object so that it specifically ignores this defect (does not raise an error in that case). The smtp library should be raising an error if an attempt is made to actually send using a utf8 address to a server that does not support SMTPUTF8. With this compromise backward compatibility will be maintained, while still allowing utf8 addresses to be specified using Address (which should only be used by library consumers, not the library itself). I believe I am correct that such addresses will not otherwise be rejected by the email library.
This leaves a dangling issue that if we parse a message using utf8=True we technically should not get a defect, but unless and until we teach the header value parser about the policy we can't really fix that.
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
…r_spec RFC 6530 allows UTF-8 strings in the local part.
ac72c64 to
2bbb200
Compare
|
I have made the requested changes; please review again |
|
Thanks for making the requested changes! @bitdancer: please review the changes made to this pull request. |
|
This PR is stale because it has been open for 30 days with no activity. |
RFC 6530 allows UTF-8 strings