Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
113 changes: 113 additions & 0 deletions Lib/saslprep.py
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add this as Lib/_saslprep.py as we aren't offering this up as a public documented stdlib module and API.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can imagine this being useful outside of smtplib. I wrote it originally for PyMongo. I would have loved to be able to just import it instead. :-)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a new stdlib module shouldn't be taken lightly.
Would it make sense to add the function to smtplib.py directly?

If you want this to be public documented API, then it should have documentation :)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's useful in imaplib, at least, which also uses SASL. I'd love to refactor the SASL use im imaplib and smtplib, but that's outside the scope of this PR. That refactoring should probably make saslprep public, if it wasn't before? I have no opinion on whether saslprep should be public without that refactoring.

(Many thanks for all the other comments; I'll get back to this tomorrow. Meetings and another PR grabbed the mutex on my day today and kept it until now.)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with it living in its own file (which may be considered "required" for for license text reasons), but if this is to be a public API, I'd prefer to see the function exposed publicly from within an existing stdlib module, not a new top level module name.

It is a bit of a utility function. Another plausible stdlib module it could make sense to be exposed publicly from would be hashlib, given its purpose.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be allowed to move the licence elsewhere? In fact we should probably add it to the long list of small licences...
After the code becomes part of Python, it will be improved, refactored and moved around. If that isn't allowed, we should IMO keep it as an external dependency.

Original file line number Diff line number Diff line change
@@ -0,0 +1,113 @@
# Copyright 2016-present MongoDB, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""An implementation of RFC4013 SASLprep."""
from typing import Any, Optional

try:
import stringprep
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stringprep is available in all supported versions of python.
This check should be removed.

except ImportError:
HAVE_STRINGPREP = False

def saslprep(data: Any, prohibit_unassigned_code_points: Optional[bool] = True) -> str:
"""SASLprep dummy"""
if isinstance(data, str):
raise TypeError(
"The stringprep module is not available. Usernames and "
"passwords must be instances of bytes."
)
return data

else:
HAVE_STRINGPREP = True
import unicodedata

# RFC4013 section 2.3 prohibited output.
_PROHIBITED = (
# A strict reading of RFC 4013 requires table c12 here, but
# characters from it are mapped to SPACE in the Map step. Can
# normalization reintroduce them somehow?
stringprep.in_table_c12,
stringprep.in_table_c21_c22,
stringprep.in_table_c3,
stringprep.in_table_c4,
stringprep.in_table_c5,
stringprep.in_table_c6,
stringprep.in_table_c7,
stringprep.in_table_c8,
stringprep.in_table_c9,
)

def saslprep(data: Any, prohibit_unassigned_code_points: Optional[bool] = True) -> str:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Any annotation does nothing. According to the docstring, it should be bytes|str. (But you can drop it altogether; we don't run typecheckers for the stdlib.)
Why is the bool Optional? I don't see a reason for allowing None here.

"""An implementation of RFC4013 SASLprep.

:Parameters:
- `data`: The string to SASLprep. Unicode strings
(:class:`str`) are supported. Byte strings
(:class:`bytes`) are ignored.
- `prohibit_unassigned_code_points`: True / False. RFC 3454
and RFCs for various SASL mechanisms distinguish between
`queries` (unassigned code points allowed) and
`stored strings` (unassigned code points prohibited). Defaults
Comment on lines +59 to +62
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, for the argument name it would be better to refer to the use case (something like store=True/False or intent='query'/'store'), and leave the effect as more of an implementation detail.
How was the default chosen? It seems that if you want False but forget it, you'll get subtly wrong (and possibly dangerous?) behaviour. If that's the case, there should be no default; users should always specify this.

to ``True`` (unassigned code points are prohibited).

:Returns:
The SASLprep'ed version of `data`.
"""
prohibited: Any
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does nothing.

Suggested change
prohibited: Any

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be there to silence a type checker.

Type checkers in Python consider tuples structures, not immutable lists. Therefore concatenating tuples changes their type.

Instead of removing the line, I'd provide a better annotation.

Something like:

Suggested change
prohibited: Any
prohibited: tuple[Callable[[str], bool], ...]


if not isinstance(data, str):
return data

if prohibit_unassigned_code_points:
prohibited = _PROHIBITED + (stringprep.in_table_a1,)
else:
prohibited = _PROHIBITED

# RFC3454 section 2, step 1 - Map
# RFC4013 section 2.1 mappings
# Map Non-ASCII space characters to SPACE (U+0020). Map
# commonly mapped to nothing characters to, well, nothing.
in_table_c12 = stringprep.in_table_c12
in_table_b1 = stringprep.in_table_b1
data = "".join(
["\u0020" if in_table_c12(elt) else elt for elt in data if not in_table_b1(elt)]
)

# RFC3454 section 2, step 2 - Normalize
# RFC4013 section 2.2 normalization
data = unicodedata.ucd_3_2_0.normalize("NFKC", data)

in_table_d1 = stringprep.in_table_d1
if in_table_d1(data[0]):
if not in_table_d1(data[-1]):
# RFC3454, Section 6, #3. If a string contains any
# RandALCat character, the first and last characters
# MUST be RandALCat characters.
raise ValueError("SASLprep: failed bidirectional check")
# RFC3454, Section 6, #2. If a string contains any RandALCat
# character, it MUST NOT contain any LCat character.
prohibited = prohibited + (stringprep.in_table_d2,)
else:
# RFC3454, Section 6, #3. Following the logic of #3, if
# the first character is not a RandALCat, no other character
# can be either.
prohibited = prohibited + (in_table_d1,)

# RFC3454 section 2, step 3 and 4 - Prohibit and check bidi
for char in data:
if any(in_table(char) for in_table in prohibited):
raise ValueError("SASLprep: failed prohibited character check")

return data
15 changes: 8 additions & 7 deletions Lib/smtplib.py
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,7 @@
import datetime
import sys
from email.base64mime import body_encode as encode_base64
from saslprep import saslprep

__all__ = ["SMTPException", "SMTPNotSupportedError", "SMTPServerDisconnected", "SMTPResponseException",
"SMTPSenderRefused", "SMTPRecipientsRefused", "SMTPDataError",
Expand Down Expand Up @@ -638,7 +639,7 @@ def auth(self, mechanism, authobject, *, initial_response_ok=True):
mechanism = mechanism.upper()
initial_response = (authobject() if initial_response_ok else None)
if initial_response is not None:
response = encode_base64(initial_response.encode('ascii'), eol='')
response = encode_base64(initial_response.encode('utf-8'), eol='')
(code, resp) = self.docmd("AUTH", mechanism + " " + response)
self._auth_challenge_count = 1
else:
Expand All @@ -649,7 +650,7 @@ def auth(self, mechanism, authobject, *, initial_response_ok=True):
self._auth_challenge_count += 1
challenge = base64.decodebytes(resp)
response = encode_base64(
authobject(challenge).encode('ascii'), eol='')
authobject(challenge).encode('utf-8'), eol='')
(code, resp) = self.docmd(response)
# If server keeps sending challenges, something is wrong.
if self._auth_challenge_count > _MAXCHALLENGE:
Expand All @@ -667,21 +668,21 @@ def auth_cram_md5(self, challenge=None):
# CRAM-MD5 does not support initial-response.
if challenge is None:
return None
return self.user + " " + hmac.HMAC(
self.password.encode('ascii'), challenge, 'md5').hexdigest()
return saslprep(self.user) + " " + hmac.HMAC(
saslprep(self.password).encode('utf-8'), challenge, 'md5').hexdigest()

def auth_plain(self, challenge=None):
""" Authobject to use with PLAIN authentication. Requires self.user and
self.password to be set."""
return "\0%s\0%s" % (self.user, self.password)
return "\0%s\0%s" % (saslprep(self.user), saslprep(self.password))

def auth_login(self, challenge=None):
""" Authobject to use with LOGIN authentication. Requires self.user and
self.password to be set."""
if challenge is None or self._auth_challenge_count < 2:
return self.user
return saslprep(self.user)
else:
return self.password
return saslprep(self.password)

def login(self, user, password, *, initial_response_ok=True):
"""Log in on an SMTP server that requires authentication.
Expand Down
39 changes: 39 additions & 0 deletions Lib/test/test_saslprep.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
# Copyright 2016-present MongoDB, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import sys
import unittest
from saslprep import saslprep


class TestSASLprep(unittest.TestCase):
def test_saslprep(self):
try:
import stringprep # noqa
except ImportError:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @barry-scott said above, stringprep is always available in 3.13+.

self.assertRaises(TypeError, saslprep, "anything...")
# Bytes strings are ignored.
self.assertEqual(saslprep(b"user"), b"user")
else:
# Examples from RFC4013, Section 3.
self.assertEqual(saslprep("I\u00ADX"), "IX")
self.assertEqual(saslprep("user"), "user")
self.assertEqual(saslprep("USER"), "USER")
self.assertEqual(saslprep("\u00AA"), "a")
self.assertEqual(saslprep("\u2168"), "IX")
self.assertRaises(ValueError, saslprep, "\u0007")
self.assertRaises(ValueError, saslprep, "\u0627\u0031")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also add the test cases from mongodb's JS library?

# Bytes strings are ignored.
self.assertEqual(saslprep(b"user"), b"user")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's test that with some non-UTF8 bytes?

Suggested change
self.assertEqual(saslprep(b"user"), b"user")
self.assertEqual(saslprep(b"user"), b"user")
self.assertEqual(saslprep(b"\0\xff"), b"\0\xff")

68 changes: 40 additions & 28 deletions Lib/test/test_smtplib.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
import email.mime.text
from email.message import EmailMessage
from email.base64mime import body_encode as encode_base64
from saslprep import saslprep
import email.utils
import hashlib
import hmac
Expand Down Expand Up @@ -808,6 +809,10 @@ def testLineTooLong(self):
}

sim_auth = ('Mr.A@somewhere.com', 'somepassword')
sim_auths = {
'Mr.A@somewhere.com' : 'somepassword',
'भारत@भारत' : 'भारत@',
'Mr.C@somewhere.com' : 'IX'}
sim_cram_md5_challenge = ('PENCeUxFREJoU0NnbmhNWitOMjNGNn'
'dAZWx3b29kLmlubm9zb2Z0LmNvbT4=')
sim_lists = {'list-1':['Mr.A@somewhere.com','Mrs.C@somewhereesle.com'],
Expand Down Expand Up @@ -895,7 +900,7 @@ def _auth_plain(self, arg=None):
self.push('535 Splitting response {!r} into user and password'
' failed: {}'.format(logpass, e))
return
self._authenticated(user, password == sim_auth[1])
self._authenticated(user, saslprep(password) == sim_auths[user])

def _auth_login(self, arg=None):
if arg is None:
Expand All @@ -907,7 +912,8 @@ def _auth_login(self, arg=None):
self.push('334 UGFzc3dvcmQ6')
else:
password = self._decode_base64(arg)
self._authenticated(self._auth_login_user, password == sim_auth[1])
self._authenticated(self._auth_login_user,
saslprep(password) == sim_auths[self._auth_login_user])
del self._auth_login_user

def _auth_buggy(self, arg=None):
Expand All @@ -927,8 +933,8 @@ def _auth_cram_md5(self, arg=None):
'failed: {}'.format(logpass, e))
return False
valid_hashed_pass = hmac.HMAC(
sim_auth[1].encode('ascii'),
self._decode_base64(sim_cram_md5_challenge).encode('ascii'),
saslprep(sim_auths[user].encode('utf-8')),
self._decode_base64(sim_cram_md5_challenge).encode('utf-8'),
'md5').hexdigest()
self._authenticated(user, hashed_pass == valid_hashed_pass)
# end AUTH related stuff.
Expand Down Expand Up @@ -1117,30 +1123,41 @@ def testEXPN(self):
self.assertEqual(smtp.expn(u), expected_unknown)
smtp.quit()

def helpAUTH_x(self, feature):
self.serv.add_feature(feature)
for username in sim_auths.keys():
with self.subTest(username=username):
smtp = smtplib.SMTP(HOST, self.port, local_hostname='localhost', timeout=15)
resp = smtp.login(username, sim_auths[username])
self.assertEqual(resp, (235, b'Authentication Succeeded'))
smtp.close()
with self.subTest(username=username):
smtp = smtplib.SMTP(HOST, self.port, local_hostname='localhost', timeout=15)
with self.assertRaises(smtplib.SMTPAuthenticationError):
smtp.login(username, "No" + sim_auths[username])
smtp.close()
with self.subTest(username=username):
smtp = smtplib.SMTP(HOST, self.port, local_hostname='localhost', timeout=15)
resp = smtp.login(username, sim_auths[username] + u"\u00AD")
self.assertEqual(resp, (235, b'Authentication Succeeded'))
smtp.close()

def testAUTH_PLAIN(self):
self.serv.add_feature("AUTH PLAIN")
smtp = smtplib.SMTP(HOST, self.port, local_hostname='localhost',
timeout=support.LOOPBACK_TIMEOUT)
resp = smtp.login(sim_auth[0], sim_auth[1])
self.assertEqual(resp, (235, b'Authentication Succeeded'))
smtp.close()
self.helpAUTH_x("AUTH PLAIN")

def testAUTH_LOGIN(self):
self.serv.add_feature("AUTH LOGIN")
smtp = smtplib.SMTP(HOST, self.port, local_hostname='localhost',
timeout=support.LOOPBACK_TIMEOUT)
resp = smtp.login(sim_auth[0], sim_auth[1])
self.assertEqual(resp, (235, b'Authentication Succeeded'))
smtp.close()
self.helpAUTH_x("AUTH LOGIN")

def testAUTH_LOGIN_initial_response_ok(self):
self.serv.add_feature("AUTH LOGIN")
with smtplib.SMTP(HOST, self.port, local_hostname='localhost',
timeout=support.LOOPBACK_TIMEOUT) as smtp:
smtp.user, smtp.password = sim_auth
smtp.ehlo("test_auth_login")
resp = smtp.auth("LOGIN", smtp.auth_login, initial_response_ok=True)
self.assertEqual(resp, (235, b'Authentication Succeeded'))
for username in sim_auths.keys():
with smtplib.SMTP(HOST, self.port, local_hostname='localhost',
timeout=support.LOOPBACK_TIMEOUT) as smtp:
smtp.user = username
smtp.password = sim_auths[username]
smtp.ehlo("test_auth_login")
resp = smtp.auth("LOGIN", smtp.auth_login, initial_response_ok=True)
self.assertEqual(resp, (235, b'Authentication Succeeded'))

def testAUTH_LOGIN_initial_response_notok(self):
self.serv.add_feature("AUTH LOGIN")
Expand Down Expand Up @@ -1183,12 +1200,7 @@ def testAUTH_CRAM_MD5(self):
@hashlib_helper.requires_hashdigest('md5', openssl=True)
def testAUTH_multiple(self):
# Test that multiple authentication methods are tried.
self.serv.add_feature("AUTH BOGUS PLAIN LOGIN CRAM-MD5")
smtp = smtplib.SMTP(HOST, self.port, local_hostname='localhost',
timeout=support.LOOPBACK_TIMEOUT)
resp = smtp.login(sim_auth[0], sim_auth[1])
self.assertEqual(resp, (235, b'Authentication Succeeded'))
smtp.close()
self.helpAUTH_x("AUTH BOGUS PLAIN LOGIN CRAM-MD5")

def test_auth_function(self):
supported = {'PLAIN', 'LOGIN'}
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
Added support for unicode logins/passwords for SMTP authentication.

This includes SASLprep support, graciously contributed by MongoDB.com.
1 change: 1 addition & 0 deletions Python/stdlib_module_names.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.