Rename files, folders, and their contents. With Python.

November 7, 2015

Ever had a hopelessly disorganized folder whose file names don’t follow any particular format? You could put together a script that renames every file from scratch and adds a number to the end (or something to that effect), but then you lose the key identifying information contained in a filename such as “MyFirstHomeVid.avi”.

So what I did was create a program that simply trims off unwanted prefixes/suffixes or “bad strings”. This approach takes a little bit more finesse and relies on enumerating things manually, but the finished product retains descriptive file names that are helpful for humans, while also approaching a universal format. With a few tweaks or modifications of your own, you can rename all your files in bulk. Note: the methods used here have only been tested with a small number of folders/files ( < 1000). For scalability, this is most certainly NOT the best implementation. Home use only! I updated the post with a paragraph at the end which links to a new library that performs with C-like speed!

To begin, we only need a few modules. sys helps make the script command-line friendly, os contains all we need to work with paths and renaming, and re handles pattern substitution.

# -*- coding: utf-8 -*-
# @author: transposed messenger
# Tested on Ubuntu 14.04, Windows 10

import os
import re
import sys
from os.path import join

def main(argv):

	if (len(sys.argv) != 2):
		# Raw string so the backslashes in the Windows path are not read as escapes such as \t
		sys.exit(r'Usage: /system/path/to/target/dir/for/Linux OR C:\path\to\dir\on\Windows')

This part makes it so all I need to do is open the command terminal and type the script name followed by “/my/desired/directory/to/clean”. sys.argv captures the arguments passed in when calling the code: sys.argv[0] is the name of the script as it was invoked, and sys.argv[1] is the path you enter. Like: [image: powershell cmd line]
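The argument check can also be tried on its own. This is a minimal sketch, assuming a hypothetical helper named parse_target_dir and a made-up script name (rename.py); neither appears in the original script. Passing the argument list in explicitly makes the behaviour easy to test without a terminal:

```python
import sys

def parse_target_dir(argv):
    """Return the single directory argument, or exit with a usage message."""
    # argv[0] is the script name; argv[1] should be the target directory.
    if len(argv) != 2:
        sys.exit('Usage: <script> <target directory>')
    return argv[1]

# Simulated argument list; a real run would pass sys.argv instead.
target = parse_target_dir(['rename.py', '/my/desired/directory/to/clean'])
print(target)
```

Calling it with anything other than exactly one path raises SystemExit, which is what sys.exit() does under the hood.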

The following os.walk() loop will rename files, folders, subfolders, and their contents:

	folderpath = sys.argv[1]

	#Please initialize your own list of unwanted strings here
	bad_strings = []

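	# Walk bottom-up so a folder is renamed only after everything inside it has been visited
	for root, subFolders, files in os.walk(folderpath, topdown=False):
		for folder in subFolders:
			newname = folder
			for k in bad_strings:
				# str.replace() treats k literally; re.sub() would read it as a regex.
				# The "k and" guard skips empty strings, which would loop forever.
				while k and k in newname:
					newname = newname.replace(k, '')
			if newname != folder:
				os.rename(join(root, folder), join(root, newname))
		for filename in files:
			newname = filename
			for k in bad_strings:
				while k and k in newname:
					newname = newname.replace(k, '')
			if newname != filename:
				os.rename(join(root, filename), join(root, newname))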

This is the basic code to recursively traverse a directory and its contents and take out anything listed in “bad_strings”.
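The trimming step can be tried on plain strings before letting it loose on a real folder. This is a minimal sketch, assuming the bad strings are meant as literal text (so it uses str.replace() rather than a regular expression); the function name strip_bad_strings and the sample values are hypothetical:

```python
def strip_bad_strings(name, bad_strings):
    """Remove every occurrence of each unwanted substring from name."""
    for bad in bad_strings:
        # Skip empty entries: '' is contained in every string,
        # so an unguarded loop would never end.
        while bad and bad in name:
            name = name.replace(bad, '')
    return name

# Hypothetical file name and unwanted strings, for illustration only.
cleaned = strip_bad_strings('[SiteTag]MyFirstHomeVid.COPY.avi', ['[SiteTag]', '.COPY'])
print(cleaned)  # MyFirstHomeVid.avi
```

A string like '[SiteTag]' is a good reason to prefer literal replacement: as a regular expression it would be read as a character class and strip individual letters instead.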

			# Pick out unwanted files; startswith()/endswith() also accept a tuple of options
			if filename.startswith('Sample') or filename.endswith(('.txt', '.jpg', '.png')):
				full = join(root, filename)
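The same test can be written as a small predicate, which makes it easy to preview what would match before anything is removed. This is a sketch only: should_delete and its default prefixes and suffixes are assumptions modelled on the condition above, and it deliberately only reports matches rather than deleting them.

```python
def should_delete(filename, prefixes=('Sample',), suffixes=('.txt', '.jpg', '.png')):
    """Return True when filename matches an unwanted prefix or suffix."""
    # str.startswith() and str.endswith() both accept a tuple of candidates.
    return filename.startswith(prefixes) or filename.endswith(suffixes)

# Hypothetical directory listing, for illustration only.
names = ['Sample.avi', 'notes.txt', 'MyFirstHomeVid.avi', 'cover.png']
flagged = [n for n in names if should_delete(n)]
print(flagged)  # ['Sample.avi', 'notes.txt', 'cover.png']
```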

To deal with special characters, brackets, parentheses, and so on, I used a different method because of a system error I kept getting:

	# Walk bottom-up so folders are renamed only after their contents have been handled
	for root, subFolders, files in os.walk(folderpath, topdown=False):
		for folder in subFolders:
			newfolder = folder
			newfolder = newfolder.replace('-', '')
			newfolder = newfolder.replace(' ', '.')
			newfolder = newfolder.replace('...', '.')
			newfolder = newfolder.replace('..', '.')
			# The next three match only the literal two-character pairs, e.g. '[]'
			newfolder = newfolder.replace('[]', '.')
			newfolder = newfolder.replace('()', '.')
			newfolder = newfolder.replace('{}', '.')
			# Only touch the disk when the name actually changed
			if newfolder != folder:
				os.rename(join(root, folder), join(root, newfolder))
		for filename in files:
			newname = filename
			newname = newname.replace('-', '')
			newname = newname.replace(' ', '.')
			newname = newname.replace('...', '.')
			newname = newname.replace('..', '.')
			newname = newname.replace('[]', '.')
			newname = newname.replace('()', '.')
			newname = newname.replace('{}', '.')
			if newname != filename:
				os.rename(join(root, filename), join(root, newname))

The end result achieves something like this: [image: rename example]
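The replacement chain can also be tried on a plain string, without touching the disk. This is a minimal sketch that applies the same substitutions in the same order; the function name clean_name and the sample file name are hypothetical:

```python
def clean_name(name):
    """Apply the same replacements, in the same order, as the chain above."""
    replacements = [('-', ''), (' ', '.'), ('...', '.'), ('..', '.'),
                    ('[]', '.'), ('()', '.'), ('{}', '.')]
    for old, new in replacements:
        name = name.replace(old, new)
    return name

# Hypothetical file name, for illustration only.
print(clean_name('My First - Home Vid.avi'))  # My.First.Home.Vid.avi
```

The order matters: the dash is removed first, which leaves two spaces in a row, and the later '..' replacement is what collapses the resulting double dot.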

The folder and filename renaming met a namespace error when removing these special characters (or, IIRC, the last character in a filename), hence this clunky sequence of replacing single characters. Some things could be done in fewer lines, but I just went with whatever wasn’t getting errors. In hindsight, this bit of the official Python documentation may explain one way I was getting errors:

When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, impose a specific order of visiting, or even to inform walk() about directories the caller creates or renames before it resumes walk() again. Modifying dirnames when topdown is False has no effect on the behavior of the walk, because in bottom-up mode the directories in dirnames are generated before dirpath itself is generated.
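In practice, that passage suggests one way to rename folders during a top-down walk: write the new name back into the list that os.walk() handed you, so it descends into the folder under its new name. This is a minimal sketch under that assumption; rename_dirs_topdown and the throwaway folder names are hypothetical, and it only handles folders, not files:

```python
import os
import tempfile

def rename_dirs_topdown(top, old, new):
    """Replace old with new in directory names, keeping os.walk() informed."""
    for root, dirnames, _files in os.walk(top, topdown=True):
        for i, name in enumerate(dirnames):
            fixed = name.replace(old, new)
            if fixed != name:
                os.rename(os.path.join(root, name), os.path.join(root, fixed))
                # Update the list in place so walk() recurses into the new name.
                dirnames[i] = fixed

# Throwaway tree for demonstration: "my folder/inner folder"
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, 'my folder', 'inner folder'))
rename_dirs_topdown(top, ' ', '.')

result = sorted(os.path.relpath(os.path.join(r, d), top)
                for r, ds, _ in os.walk(top) for d in ds)
print(result)
```

Without the in-place update, walk() would still try to enter the old name, find nothing there, and silently skip that whole subtree. Walking with topdown=False is the other way around the problem, since every folder is then renamed only after its contents have been visited.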

os.walk() is a powerful function, and it is system agnostic as well. I was able to run the code on Ubuntu and Windows 10 and get the same results. The only thing I left out of the code here was a hardcoded list of bad strings, e.g., “” placed in bad_strings would remove all instances of that URL from file names in the specified directory. I may add this later or try to compile some sort of super list (mine already has around 75 strings) and then post it up. For now, you can add in your own specifications to the source code so it looks like:
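As a purely hypothetical illustration, a filled-in list might be set up like this. The entries are placeholders, not the actual list used for this post. The sketch also filters out empty entries, since an empty string is contained in every string and an unguarded “while k in filename” loop would never terminate:

```python
# Placeholder entries for illustration only; substitute the strings you want removed.
raw_bad_strings = ['[SiteTag]', '.COPY', 'www.example.com', '']

# '' is contained in every string, so drop empty entries before using the list.
bad_strings = [s for s in raw_bad_strings if s]
print(bad_strings)  # ['[SiteTag]', '.COPY', 'www.example.com']
```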


Not shown is the line that goes “if ___.endswith(___) or ___.startswith(___)”, where you can specify what kinds of files you would like to be deleted. For this example, I put “filename.startswith(‘BONUS’)” as one of the conditionals. To complete this tool, I would want to make this part and the bad_strings variable configurable via a GUI that saves a configuration file for more intuitive use.

However, there already exists a tool available on most platforms called Filebot. I just tested this program and it is essential. It is geared towards renaming music, TV shows, and movies, so you lose some flexibility compared to creating your own Python script. However, if you are interested in managing these types of files (as many people are), then Filebot does a fine job. You can create macros that rename to a definable format. It can also move the files to a new directory. It has a command line interface. It can automagically acquire srt files and cover art. Filebot does all this by pulling data from various online databases. You can even undo a rename batch if you realize you don’t like it or it is wrong.

If your case is large scale…

Enter os.scandir(). I came across this post the other day and came back to update my post because of it. If you’re tasked with having to traverse thousands of directories and rename hundreds of thousands of files, and your solution needs to run quickly, os.scandir() is for you. Its creator set out to speed up os.walk(): scandir() returns each directory entry together with the file-type information the operating system already hands back, so the walk no longer needs a separate system call per file just to tell files from folders. That makes some traversals several times faster, and since Python 3.5 os.walk() itself is built on os.scandir(). The redesign also made the pieces far more. . . modular. Heh.
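To give a feel for the API, here is a minimal sketch that counts files under a directory with os.scandir(). It assumes Python 3.6 or later (for the with-statement support), and the function name count_files and the throwaway test tree are hypothetical:

```python
import os
import tempfile

def count_files(path):
    """Recursively count regular files under path using os.scandir()."""
    total = 0
    with os.scandir(path) as entries:
        for entry in entries:
            # is_dir()/is_file() usually answer from data cached by scandir(),
            # so no extra system call is needed per entry.
            if entry.is_dir(follow_symlinks=False):
                total += count_files(entry.path)
            elif entry.is_file(follow_symlinks=False):
                total += 1
    return total

# Throwaway tree for demonstration: one file at the top, one in a subfolder.
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, 'sub'))
for rel in ('a.txt', os.path.join('sub', 'b.txt')):
    open(os.path.join(top, rel), 'w').close()

file_count = count_files(top)
print(file_count)  # 2
```

Each entry also exposes entry.name and entry.path, so the renaming logic from earlier in the post could be dropped into the same loop.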
