Warning: os.path.join surprising behaviour

>>> import os
>>> os.path.join("/a/b/c", "/d/e/f")
'/d/e/f'

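When a later component is an absolute path, os.path.join discards everything before it. This can be a problem if “/d/e/f” comes from an untrusted source, because the result is no longer inside the base directory.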

As a safety measure, avoid using os.path.join in your web applications. Roll your own wrapper and call it “safe_join”. You will sleep better.
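
One way such a wrapper might look is sketched below. The name safe_join comes from the suggestion above, but the signature, the ValueError and the message are assumptions made for illustration, not an established API. The check is purely lexical: it does not resolve symlinks, and it relies on os.path.commonpath, which needs Python 3.5 or later.

```python
import os


def safe_join(base, untrusted):
    """Join untrusted onto base, refusing any result outside base.

    Hypothetical helper: a lexical check only, symlinks are not resolved.
    """
    # abspath makes the path absolute and collapses ".." components.
    base = os.path.abspath(base)
    candidate = os.path.abspath(os.path.join(base, untrusted))
    # commonpath compares whole path components, so "/a/b/cd" is not
    # mistaken for something inside "/a/b/c".
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes base directory: %r" % (untrusted,))
    return candidate
```

On a POSIX system, safe_join("/a/b/c", "d/e/f") returns "/a/b/c/d/e/f", while safe_join("/a/b/c", "/d/e/f") raises ValueError instead of silently returning the rooted path.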

7 Responses to “Warning: os.path.join surprising behaviour”

  1. afoo writes:

    Hm, IMHO this is a bug and should be reported. What do you think?

  2. afoo writes:

    Ah, never mind, reports already exist: http://sourceforge.net/tracker/index.php?func=detail&aid=1209447&group_id=5470&atid=105470 and http://sourceforge.net/tracker/index.php?func=detail&aid=1688564&group_id=5470&atid=105470

  3. Fuzzyman writes:

    How is that surprising? It is the documented behaviour, and usually desirable. The second path in your example is rooted.

  4. Fuzzyman writes:

    Oh - and you can stop it by detecting/removing the leading “/”.

  5. dtlin writes:

    >>> os.path.join("/a/b/c", *"/d/e/f".split('/'))
    '/a/b/c/d/e/f'

    It is a bit surprising.

  6. Dan writes:

    It surprises me. But os.path.join is inherently unsafe: ‘../../../d/e/f’ would have the same effect as ‘/d/e/f’ here. Don’t let untrusted sources supply paths!

  7. Chui writes:

    Fuzzyman,

    You are right that it’s documented. It doesn’t stop the behavior from being surprising.

    About stripping leading “/”, you also need to strip the drive name on Windows platforms.

    A better guard is to assert that the joined path is a subdirectory of the base.

    e.g.
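    base = os.path.normpath(base)
    newpath = os.path.normpath(os.path.join(base, subpath))
    # compare whole components; a raw prefix test would accept "/a/b/cd" for base "/a/b/c"
    assert os.path.commonpath([base, newpath]) == base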

    Perhaps it should be annotated as being unsafe on web platforms.

    Chui
