How to extract a filename from a URL & append a word to it?

Question or problem about Python programming:

I have the following url:

url =

I would like to extract the file name in this url: 09-09-201315-47-571378756077.jpg

Once I get this file name, I’m going to save it with this name to the Desktop.

filename = **extracted file name from the url**     
download_photo = urllib.urlretrieve(url, "/home/ubuntu/Desktop/%s.jpg" % (filename))

After this, I’m going to resize the photo. Once that is done, I’m going to save the resized version and append the word “_small” to the end of the filename.

downloadedphoto = Image.open("/home/ubuntu/Desktop/%s.jpg" % filename)
resize_downloadedphoto = downloadedphoto.resize((300, 300), Image.ANTIALIAS)
resize_downloadedphoto.save("/home/ubuntu/Desktop/%s.jpg" % (filename + "_small"))

From this, what I am trying to achieve is two files: the original photo with the original name, and the resized photo with the modified name. Like so:

09-09-201315-47-571378756077.jpg
09-09-201315-47-571378756077_small.jpg

How can I go about doing this?

How to solve the problem:

Solution 1:

You can use urllib.parse.urlparse with os.path.basename:

import os
from urllib.parse import urlparse

url = ""
a = urlparse(url)
print(a.path)                    # Output: /kyle/09-09-201315-47-571378756077.jpg
print(os.path.basename(a.path))  # Output: 09-09-201315-47-571378756077.jpg
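
To also cover the second half of the question, the same two calls can be combined with os.path.splitext to build the “_small” name. This is only a minimal sketch: the URL and the helper name small_variant are made up for illustration and are not part of the original answer.

```python
import os
from urllib.parse import urlparse


def small_variant(url, suffix="_small"):
    """Return (filename, filename with suffix inserted before the extension)."""
    # urlparse separates the path from any query string or fragment
    filename = os.path.basename(urlparse(url).path)
    # splitext splits only at the last dot, so the extension stays intact
    stem, ext = os.path.splitext(filename)
    return filename, stem + suffix + ext


# Hypothetical URL, used only for illustration
original, small = small_variant("http://example.com/kyle/photo.jpg?size=large")
print(original)  # photo.jpg
print(small)     # photo_small.jpg
```

Because the suffix is inserted before the extension, the original name can be passed to the download step and the second name to the save step after resizing.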

Solution 2:


Why try harder?

In [1]: os.path.basename("")
Out[1]: 'file.html'

In [2]: os.path.basename("")
Out[2]: 'file'

In [3]: os.path.basename("")
Out[3]: ''

In [4]: os.path.basename("")
Out[4]: ''

Note 2020-12-20

Nobody has thus far provided a complete solution.

A URL can contain a ?[query-string] and/or a #[fragment identifier], but only in that order.

In [1]: from os import path

In [2]: def get_filename(url):
   ...:     fragment_removed = url.split("#")[0]  # keep to left of first #
   ...:     query_string_removed = fragment_removed.split("?")[0]
   ...:     scheme_removed = query_string_removed.split("://")[-1].split(":")[-1]
   ...:     if scheme_removed.find("/") == -1:
   ...:         return ""
   ...:     return path.basename(scheme_removed)

In [3]: get_filename("")
Out[3]: 'b'

In [4]: get_filename("")
Out[4]: ''

In [5]: get_filename("")
Out[5]: ''

In [6]: get_filename("")
Out[6]: 'b'

In [7]: get_filename("")
Out[7]: 'b'
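
For comparison, the standard library's urllib.parse.urlsplit already separates the query string and the fragment, so a similar helper can be written without manual splitting. This is a sketch rather than a drop-in replacement: the name filename_from_url and the example URLs are assumptions, and unlike get_filename above it relies on the URL having a scheme so that the host is recognised.

```python
from posixpath import basename
from urllib.parse import urlsplit, unquote


def filename_from_url(url):
    # urlsplit puts "?query" and "#fragment" into separate fields,
    # so .path holds only the path component
    path = urlsplit(url).path
    # unquote turns percent-escapes such as %20 back into characters
    return unquote(basename(path))


# Hypothetical URLs, used only for illustration
print(filename_from_url("http://example.com/a/b?c=d#e"))      # b
print(filename_from_url("http://example.com:8080/a/b"))       # b
print(filename_from_url("http://example.com/my%20file.txt"))  # my file.txt
print(repr(filename_from_url("http://example.com/a/")))       # ''
print(repr(filename_from_url("http://example.com")))          # ''
```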

Solution 3:

filename = url[url.rfind("/")+1:]
filename_small = filename.replace(".", "_small.")

It may be safer to replace “.jpg” rather than “.” in the second line, since a “.” can also appear elsewhere in the filename.
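
To illustrate that caveat, here is a small sketch with a made-up filename that contains more than one dot. str.replace changes every “.”, while os.path.splitext splits only at the last one.

```python
import os

# Hypothetical filename with a dot inside the name itself
filename = "photo.v2.jpg"

# replace() touches every ".", which mangles the name
mangled = filename.replace(".", "_small.")
print(mangled)  # photo_small.v2_small.jpg

# splitext() splits only before the final extension
stem, ext = os.path.splitext(filename)
filename_small = stem + "_small" + ext
print(filename_small)  # photo.v2_small.jpg
```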

Solution 4:

You could just split the url by “/” and retrieve the last member of the list:

    url = ""
    filename = url.split("/")[-1] 

Then use replace to change the ending:

    small_jpg = filename.replace(".jpg", "_small.jpg")

Solution 5:

Use urllib.parse.urlparse to get just the path part of the URL, and then use pathlib.Path on that path to get the filename:

from urllib.parse import urlparse
from pathlib import Path

url = ""
a = urlparse(url)
a.path             # '/some/long/path/a_filename.jpg'
Path(a.path).name  # 'a_filename.jpg'
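
Building on this, pathlib can also produce the “_small” name the question asks for. A minimal sketch, assuming a made-up URL; PurePosixPath is used here instead of Path so that the URL path is interpreted with forward slashes on every platform.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical URL, used only for illustration
url = "http://example.com/some/long/path/a_filename.jpg"
p = PurePosixPath(urlparse(url).path)

# stem is the name without the final suffix; suffix includes the dot
small_name = p.stem + "_small" + p.suffix
print(p.name)      # a_filename.jpg
print(small_name)  # a_filename_small.jpg
```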

Solution 6:

Sometimes there is a query string:

filename = url.split("/")[-1].split("?")[0] 
new_filename = filename.replace(".jpg", "_small.jpg")
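
One caveat worth noting: if the query string itself contains a “/”, splitting on “/” first picks up part of the query instead of the filename. Dropping the query before taking the last segment avoids that. The URL below is made up to show the difference.

```python
# Hypothetical URL whose query string contains a slash
url = "http://example.com/img/photo.jpg?next=/home"

# Splitting on "/" first returns the tail of the query string
wrong = url.split("/")[-1].split("?")[0]
print(wrong)  # home

# Dropping the query first keeps the real last path segment
filename = url.split("?")[0].split("/")[-1]
new_filename = filename.replace(".jpg", "_small.jpg")
print(filename)      # photo.jpg
print(new_filename)  # photo_small.jpg
```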

Solution 7:

The question “Python split url to find image name and extension” shows how to extract the image name. To append to the name:

imageName =  '09-09-201315-47-571378756077'

new_name = '{0}_small.jpg'.format(imageName) 

Solution 8:

We can extract the filename from a URL by using the ntpath module.

import ntpath
url = ''
name, ext = ntpath.splitext(ntpath.basename(url))
# 09-09-201315-47-571378756077  .jpg

print(name + '_small' + ext)

Solution 9:

With Python 3 (3.4 and later) you can abuse the pathlib library in the following way:

from pathlib import Path

p = Path('')
p.name
# >>> 'somefile.html'
p.stem
# >>> 'somefile'
p.suffix
# >>> '.html'
p.stem + '-spamspam' + p.suffix
# >>> 'somefile-spamspam.html'

Hope this helps!