Recently I had to batch convert a bunch of raw image files to JPEG. I was given a folder containing a lot of image files; some were in JPEG format and others were in NEF format. NEF files are raw image files that come straight from a digital camera. Windows 7’s built-in file browser could not display the images in NEF files, but Photoshop could. They were large compared to the JPEG files. My task was to convert all the NEF files to JPEG and discard the NEF files to save disk space.
I thought I would write a Python script to do the job and googled around, but there seemed to be no module for the conversion. PIL is a popular image library for Python, but it doesn’t support reading NEF files.
So I wondered: could I automate Photoshop? Did Photoshop have some kind of API? I googled further and found that Photoshop has a built-in batch conversion feature. You can access it from the File > Scripts > Image Processor… menu in Photoshop. See Getting To Know Photoshop: Image Processor.
The images folder I had to work with had many subfolders. Image files in some subfolders were all JPEG, but some subfolders only had NEF files in them, and then there were subfolders which had both JPEG files and NEF files. I made a simplified test folder called pic-folder which has two subfolders and six image files in it. The file list of pic-folder:
pic-folder\10\A.jpg
pic-folder\10\A.NEF
pic-folder\10\B.jpg
pic-folder\10\B.NEF
pic-folder\20\C.NEF
pic-folder\20\D.NEF
The subfolder 10 represents a subfolder where somebody else has already converted the NEF files to JPEG but not discarded the NEF files. The subfolder 20 represents a subfolder where conversion is not done.
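If you want to try the scripts in this post without touching real photos, a test folder like this can be created with a few lines of Python. This is only a sketch: the helper name make_test_tree is made up for illustration, and the files it creates are empty placeholders, which is enough for testing the file-moving logic but not for Photoshop itself.

```python
import os
import tempfile

# Relative paths of the six test files described above.
TEST_FILES = [
    os.path.join("10", "A.jpg"),
    os.path.join("10", "A.NEF"),
    os.path.join("10", "B.jpg"),
    os.path.join("10", "B.NEF"),
    os.path.join("20", "C.NEF"),
    os.path.join("20", "D.NEF"),
]


def make_test_tree(parent_dir, name="pic-folder"):
    """Create parent_dir/name filled with empty placeholder files.

    Returns the path of the created folder.
    """
    root = os.path.join(parent_dir, name)
    for rel in TEST_FILES:
        path = os.path.join(root, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        # Empty files are enough for exercising the move/remove scripts.
        open(path, "wb").close()
    return root


if __name__ == "__main__":
    # Build the tree in a fresh temporary directory and list it.
    root = make_test_tree(tempfile.mkdtemp())
    for p, dirs, files in os.walk(root):
        for fn in sorted(files):
            print(os.path.join(p, fn))
```

Creating the tree under a temporary directory keeps the experiment away from the real images folder.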
Let’s see what Photoshop does with that folder. Open Image Processor from Photoshop. In the “1. Select the images to process” section, select pic-folder and check “Include All sub-folders”. In the “2. Select location to save processed images” section, select a different, empty folder, say pic-folder-dest, and check “Keep folder structure”. In the “3. File Type” section, make sure “Save as JPEG” is checked, which it is by default. In the “4. Preferences” section, make sure “Run Action” is unchecked. I don’t know what the “Include ICC Profile” option does, but it’s checked by default, so let’s leave it checked. Now click “Run”. Photoshop will open each image file in the subfolders of pic-folder and save it as JPEG in the corresponding subfolder of pic-folder-dest. After Photoshop finishes, the file list of pic-folder-dest should look like this:
pic-folder-dest\10\A.jpg
pic-folder-dest\10\A_1.jpg
pic-folder-dest\10\B.jpg
pic-folder-dest\10\B_1.jpg
pic-folder-dest\20\C.jpg
pic-folder-dest\20\D.jpg
Photoshop processed all six files so we got six JPEG files in pic-folder-dest. It didn’t skip JPEG files. It first processed pic-folder\10\A.jpg, which is already in JPEG format, and saved the result as pic-folder-dest\10\A.jpg, and then it processed pic-folder\10\A.NEF, and saved the result as pic-folder-dest\10\A_1.jpg because the name pic-folder-dest\10\A.jpg was occupied by then, and so on. That’s a problem.
How can we skip the jpg files, and also skip the NEF files that already have corresponding jpg files? Photoshop has a scripting feature. Maybe there is a way to customize what Image Processor does through scripting, but I had no time to learn how to script Photoshop. The folder I was given had to be processed within a few days. So what I did was write and run a pre-processing Python script that moves all NEF files without corresponding jpg files to a separate folder, then run Image Processor on that separate folder, and finish with a post-processing Python script that takes the output jpg files and puts them into the original folder.
Part of pre-processing script:
import shutil, os, errno

# http://stackoverflow.com/questions/273192/python-best-way-to-create-directory-if-it-doesnt-exist-for-file-write
def ensuredirs(path):
    try:
        os.makedirs(path)
    except OSError as exc:
        if exc.errno == errno.EEXIST:
            pass
        else:
            raise

def move_file(src, dst, dryrun=False):
    if dryrun:
        # Only report what would be done.
        print('os.rename(A, B)')
        print('A: %s' % src)
        print('B: %s' % dst)
    else:
        ensuredirs(os.path.dirname(dst))
        os.rename(src, dst)

JPG_EXTS = ['.JPG', '.JPEG', '.jpg', '.jpeg']

def is_nef(fn):
    return fn.endswith('.nef') or fn.endswith('.NEF')

def is_jpg(fn):
    root, ext = os.path.splitext(fn)
    return ext in JPG_EXTS

def take_nefs_out(src_dir, dest_dir, dryrun=False):
    """Take out NEF files in src_dir with no accompanying JPEG files,
    create folder dest_dir, and move the files to dest_dir,
    preserving the folder structure.
    """
    assert os.path.isdir(src_dir)
    assert not os.path.exists(dest_dir)
    for p, dirs, files in os.walk(src_dir):
        for fn in files:
            if is_nef(fn):
                root, ext = os.path.splitext(fn)
                # Skip NEF files that already have a JPEG next to them.
                if any(root + jpg_ext in files for jpg_ext in JPG_EXTS):
                    continue
                fullpath = os.path.join(p, fn)
                # Same relative path, but under dest_dir.
                newfullpath = fullpath.replace(src_dir, dest_dir, 1)
                move_file(fullpath, newfullpath, dryrun)
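The destination path above is built with a string replace, which works here because os.walk yields paths that start with src_dir. For comparison, here is a sketch of the same mapping done with os.path.relpath, which makes the intent explicit and rejects paths that are not under src_dir. The helper name remap_path is hypothetical and not part of the original script.

```python
import os


def remap_path(path, src_dir, dest_dir):
    """Return the path under dest_dir that mirrors path's place under src_dir."""
    rel = os.path.relpath(path, src_dir)
    # A relative path that climbs out of src_dir means path is not inside it.
    if rel == os.pardir or rel.startswith(os.pardir + os.sep):
        raise ValueError("%s is not inside %s" % (path, src_dir))
    return os.path.join(dest_dir, rel)


if __name__ == "__main__":
    src = os.path.join("pic-folder", "20", "C.NEF")
    print(remap_path(src, "pic-folder", "nef-folder"))
```

In take_nefs_out this would stand in for the fullpath.replace(src_dir, dest_dir, 1) line; for the folders used in this post the two approaches should give the same result.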
Try a dry run on the test folder.
os.chdir(parent_folder_of_pic_folder)
take_nefs_out("pic-folder", "nef-folder", dryrun=True)

os.rename(A, B)
A: pic-folder\20\C.NEF
B: nef-folder\20\C.NEF
os.rename(A, B)
A: pic-folder\20\D.NEF
B: nef-folder\20\D.NEF

OK, it’s selecting the right NEF files and it seems they’ll move to the right places. Now run take_nefs_out for real.

os.chdir(parent_folder_of_pic_folder)
take_nefs_out("pic-folder", "nef-folder")
After that, the file list for pic-folder should be:
pic-folder\10\A.jpg
pic-folder\10\A.NEF
pic-folder\10\B.jpg
pic-folder\10\B.NEF
and the file list for nef-folder should be:
nef-folder\20\C.NEF
nef-folder\20\D.NEF
Run Image Processor on nef-folder to create JPEG files in jpg-folder. Then the file list for jpg-folder should be:
jpg-folder\20\C.jpg
jpg-folder\20\D.jpg
Finally, we need to move the JPEG files in jpg-folder to pic-folder and remove the NEF files.
Part of post-processing script:
def move_jpgs(src_dir, dest_dir, dryrun=False):
    assert all(os.path.isdir(d) for d in [dest_dir, src_dir])
    for p, dirs, files in os.walk(src_dir):
        for fn in files:
            assert is_jpg(fn)
            src_path = os.path.join(p, fn)
            dst_path = src_path.replace(src_dir, dest_dir, 1)
            move_file(src_path, dst_path, dryrun)

move_jpgs("jpg-folder", "pic-folder", dryrun=True)

After running move_jpgs("jpg-folder", "pic-folder"), the file list of pic-folder should be:
pic-folder\10\A.jpg
pic-folder\10\A.NEF
pic-folder\10\B.jpg
pic-folder\10\B.NEF
pic-folder\20\C.jpg
pic-folder\20\D.jpg
To delete NEF files:
def remove_file(fn, dryrun=False):
    assert os.path.isfile(fn)
    if dryrun:
        # Only report what would be deleted.
        print('os.remove on: %s' % fn)
    else:
        os.remove(fn)

def remove_nefs(adir, dryrun=False):
    """Remove NEF files from folder adir."""
    assert os.path.isdir(adir)
    for p, dirs, files in os.walk(adir):
        for fn in files:
            if is_nef(fn):
                fullpath = os.path.join(p, fn)
                remove_file(fullpath, dryrun)

remove_nefs("pic-folder")
After that, the file list of pic-folder should be:
pic-folder\10\A.jpg
pic-folder\10\B.jpg
pic-folder\20\C.jpg
pic-folder\20\D.jpg
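Since remove_nefs deletes every NEF file it finds, it may be worth checking beforehand that each NEF file really has a JPEG next to it. The following is only a sketch of such a check. The function name nefs_without_jpg is made up for illustration, and unlike the scripts above it compares extensions case-insensitively, which is an assumption about what is wanted.

```python
import os

# Lowercase JPEG extensions; file extensions are lowercased before comparing.
JPG_EXTS_LOWER = {".jpg", ".jpeg"}


def nefs_without_jpg(adir):
    """Return sorted paths of NEF files under adir that have no JPEG
    with the same base name in the same folder."""
    orphans = []
    for p, dirs, files in os.walk(adir):
        # Base names of all JPEG files in this folder.
        jpg_roots = set()
        for fn in files:
            root, ext = os.path.splitext(fn)
            if ext.lower() in JPG_EXTS_LOWER:
                jpg_roots.add(root)
        # Any NEF whose base name is not among them has no JPEG counterpart.
        for fn in files:
            root, ext = os.path.splitext(fn)
            if ext.lower() == ".nef" and root not in jpg_roots:
                orphans.append(os.path.join(p, fn))
    return sorted(orphans)
```

If nefs_without_jpg("pic-folder") returns an empty list, every NEF file has a JPEG counterpart in its folder; otherwise the returned paths are the files to look at before deleting anything.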