Removing the first folder in a path

Question or problem about Python programming:

I have a path which looks like

/First/Second/Third/Fourth/Fifth

and I would like to remove the First from it, thus obtaining

Second/Third/Fourth/Fifth

The only idea I could come up with is to use os.path.split recursively, but this does not seem optimal. Is there a better solution?

How to solve the problem:

Solution 1:

There really is nothing in the os.path module to do this. Every so often, someone suggests creating a splitall function that returns a list (or iterator) of all of the components, but it never gained enough traction.

Partly this is because every time anyone suggested adding new functionality to os.path, it re-ignited the long-standing dissatisfaction with the general design of the library, leading to someone proposing a new, more OO-like API for paths to deprecate the old, clunky API. In 3.4, that finally happened, with pathlib. And it’s already got functionality that wasn’t in os.path. So:

>>> import pathlib
>>> p = pathlib.Path('/First/Second/Third/Fourth/Fifth')
>>> p.parts[2:]
('Second', 'Third', 'Fourth', 'Fifth')
>>> pathlib.Path(*p.parts[2:])
PosixPath('Second/Third/Fourth/Fifth')

Or… are you sure you really want to remove the first component, rather than do this?

>>> p.relative_to(*p.parts[:2])
PosixPath('Second/Third/Fourth/Fifth')
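The difference between the two pathlib spellings above can be sketched as follows. This is only an illustration: it uses PurePosixPath (an assumption, chosen so the result is the same on any OS and no filesystem is touched), and "/Other" is a made-up prefix.

```python
from pathlib import PurePosixPath

# PurePosixPath does no filesystem access and behaves the same on every OS.
p = PurePosixPath("/First/Second/Third/Fourth/Fifth")

# Slicing parts drops the root and the first folder, whatever they are called.
sliced = PurePosixPath(*p.parts[2:])

# relative_to only succeeds when the path really starts with the given prefix.
relative = p.relative_to("/First")

try:
    p.relative_to("/Other")  # hypothetical prefix that does not match
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(sliced, relative, mismatch_raised)
```

So slicing always removes something, while relative_to doubles as a check that the path is where you expect it to be.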

If you need to do this in 2.6-2.7 or 3.2-3.3, there’s a backport of pathlib.

Of course, you can use string manipulation, as long as you’re careful to normalize the path and use os.path.sep, and to make sure you handle the fiddly details with non-absolute paths or with systems with drive letters, and…
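A minimal sketch of the string-manipulation route, assuming POSIX-style paths only. The helper name drop_first_component is hypothetical, and posixpath is used instead of os.path so the behavior does not change with the platform. Drive letters are deliberately not handled here.

```python
import posixpath


def drop_first_component(path):
    """Hypothetical helper: drop the first folder of a POSIX-style path."""
    # Collapse duplicate separators, "." and ".." before splitting.
    normalized = posixpath.normpath(path)
    # Discard the empty strings produced by leading or doubled separators.
    parts = [part for part in normalized.split(posixpath.sep) if part]
    # Rejoin everything except the first component.
    return posixpath.sep.join(parts[1:])


print(drop_first_component("/First/Second/Third/Fourth/Fifth"))
print(drop_first_component("First//Second/"))
```

The result is always a relative path, whether or not the input started with a separator.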

Or you can just wrap up your recursive os.path.split. What exactly is “non-optimal” about it, once you wrap it up? It may be a bit slower, but we’re talking nanoseconds here, many orders of magnitude faster than even calling stat on a file. It will have recursion-depth problems if you have a filesystem that’s 1000 directories deep, but have you ever seen one? (If so, you can always turn it into a loop…) It takes a few minutes to wrap it up and write good unit tests, but that’s something you just do once and never worry about again. So, honestly, if you don’t want to use pathlib, that’s what I’d do.
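One way the wrapped-up os.path.split might look, written as a loop so recursion depth is not a concern. The name splitall is hypothetical (it is not part of os.path), and this is a sketch rather than a fully tested utility.

```python
import os.path


def splitall(path):
    """Hypothetical helper: split a path into a list of all its components."""
    parts = []
    while True:
        head, tail = os.path.split(path)
        if head == path:
            # Nothing more to split: we hit a root such as "/" or an empty string.
            if head:
                parts.append(head)
            break
        if tail:
            # Skip the empty tail produced by a trailing separator.
            parts.append(tail)
        path = head
    parts.reverse()
    return parts


components = splitall("/First/Second/Third/Fourth/Fifth")
print(components)
print("/".join(components[2:]))
```

With the components in a list, removing the first folder is just a slice, the same as with pathlib's parts.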

Solution 2:

A bit like another answer, taking advantage of os.path:

os.path.join(*(x.split(os.path.sep)[2:]))

… assuming x is your path string, os has been imported, and the string starts with a separator.

Solution 3:

A simple approach

a = '/First/Second/Third/Fourth/Fifth'
"/".join(a.strip("/").split('/')[1:])

output:

Second/Third/Fourth/Fifth

The code above splits the string on "/" and then joins the pieces back together, leaving out the first element.

Using itertools.dropwhile:

>>> import itertools
>>> a = '/First/Second/Third/Fourth/Fifth'
>>> "".join(list(itertools.dropwhile(str.isalnum, a.strip("/")))[1:])
'Second/Third/Fourth/Fifth'

Solution 4:

You can try:

os.path.relpath(your_path, '/First')
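A small sketch of what relpath does when the prefix matches and when it does not. posixpath.relpath is used here (an assumption, to keep the output platform-independent), and "/Other" is a made-up folder.

```python
import posixpath

path = "/First/Second/Third/Fourth/Fifth"

# The path lives under /First, so the prefix is simply removed.
inside = posixpath.relpath(path, "/First")

# The path does not live under /Other, so relpath climbs out with ".."
# instead of raising an error.
outside = posixpath.relpath(path, "/Other")

print(inside)
print(outside)
```

Unlike pathlib's relative_to, a mismatched prefix is not reported as an error, so this approach fits best when you already know the first folder.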

Solution 5:

I was looking for a native way to do it, but it seems there isn’t one.

I know this topic is old, but this is what I did to arrive at the best solution:
There were basically two approaches: using split() and using len(). Both had to use slicing.

1) Using split()

import time

start_time = time.time()

path = "/folder1/folder2/folder3/file.zip"
for i in range(500000):  # use xrange on Python 2
    new_path = "/" + "/".join(path.split("/")[2:])

print("--- %s seconds ---" % (time.time() - start_time))

Result: --- 0.420122861862 seconds ---

*Removing the leading "/" from the line new_path = "/" + "/"…. didn’t improve the performance much.

2) Using len(). This method only works if you provide the folder you would like to remove.

import time

start_time = time.time()

path = "/folder1/folder2/folder3/file.zip"
folder = "/folder1"
for i in range(500000):  # use xrange on Python 2
    if path.startswith(folder):
        a = path[len(folder):]

print("--- %s seconds ---" % (time.time() - start_time))

Result: --- 0.199596166611 seconds ---

*Even with that “if” to check whether the path starts with the folder name, it was about twice as fast as the first method.

In summary: each method has its pros and cons. If you are absolutely sure about the folder you want to remove, use method 2; otherwise I recommend method 1, which people here have mentioned previously.
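For anyone who wants to repeat the comparison, here is a rough sketch using the standard timeit module instead of hand-rolled time.time() calls. The function names with_split and with_len are hypothetical, the iteration count is arbitrary, and the numbers will differ from the ones quoted above.

```python
import timeit

path = "/folder1/folder2/folder3/file.zip"
folder = "/folder1"


def with_split(p=path):
    # Method 1: split on "/", drop the empty string and the first folder.
    return "/" + "/".join(p.split("/")[2:])


def with_len(p=path, prefix=folder):
    # Method 2: slice off a known prefix, leaving the path unchanged otherwise.
    return p[len(prefix):] if p.startswith(prefix) else p


split_seconds = timeit.timeit(with_split, number=100000)
len_seconds = timeit.timeit(with_len, number=100000)
print("split: %.3f s, len: %.3f s" % (split_seconds, len_seconds))
```

One caveat with method 2: startswith("/folder1") also matches a path such as "/folder10/x", so checking against folder + "/" is safer when folder names can share a prefix.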
