TransWikia.com

How to delete JPG files, but only if the matching RAW file exists?

Photography Asked by seanmc on March 13, 2021

My early (Canon G2) photos are all JPG, but when I got my Nikon D90 I initially shot in JPG, then switched to RAW+JPG, and now I would like to switch to RAW only.

I have literally thousands of photos on my HDD. The photos are in sub-directories (by date) under a single directory called Import.

I am about to import all these photos into Lightroom 3.0. However, I would like to delete all the JPG files, but only where there is already a corresponding RAW file (i.e., I no longer want to keep JPG and RAW versions of the same file).

If I can do this easily within Lightroom (after importing everything, including the duplicate JPG files) that would be great. It would also be OK if there were an easy way to do this before importing the files (but hopefully this wouldn’t involve having to visit every directory looking for filenames with both JPG and NEF extensions).

Does anybody know of a way to do this (in Lightroom, or with some tool/script in Windows)?

11 Answers

On Windows, go to the folder, and run this in a command prompt:

for /f "delims==" %r in ('dir /b *.nef') do del "%~dpr%~nr.jpg" 2> nul

Basically, it goes through the current folder, runs through the NEF files, and deletes the JPG if present. It ignores any errors if the JPG is not there.

If you want subfolders, include /s in the dir command.
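
For readers who would rather not use the command prompt, the same pairing logic can be sketched in Python. This is only an illustration and not part of the command above: the function name `delete_paired_jpgs` and its `recursive` and `dry_run` parameters are made up for this sketch. Extensions are compared without regard to case, but base names must match exactly.

```python
from pathlib import Path


def delete_paired_jpgs(folder, recursive=False, dry_run=True):
    """Find each .jpg that sits next to a .nef with the same base name.

    Hypothetical helper: with dry_run=True nothing is deleted and the
    matching JPG paths are only returned.
    """
    folder = Path(folder)
    # recursive=True plays the role of /s in the dir command.
    entries = folder.rglob("*") if recursive else folder.iterdir()
    files = [p for p in entries if p.is_file()]
    # (directory, base name) of every RAW file found.
    raw_keys = {(p.parent, p.stem) for p in files if p.suffix.lower() == ".nef"}
    matched = sorted(
        p for p in files
        if p.suffix.lower() == ".jpg" and (p.parent, p.stem) in raw_keys
    )
    if not dry_run:
        for jpg in matched:
            jpg.unlink()
    return matched
```

Calling `delete_paired_jpgs("Import", recursive=True)` first and reading the returned list is a cheap way to check the matches before running it again with `dry_run=False`.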

Correct answer by anon on March 13, 2021

  • Create an empty Library
  • From the Lightroom main menu, choose Edit > Preferences (Windows) or Lightroom > Preferences (Mac OS).
  • In the General preferences unselect "Treat JPEG Files Next To Raw Files As Separate Photos"
    • This should be the default.
  • Import all of your files (you can select search subfolders), telling it to move to a new location
  • The JPG files that have RAW files will be left in the original location for you to remove

As I understand it, the thumbnail in Lightroom might say RAW+JPG, but the JPG isn't actually stored or accessible in any way.

You can also write a pretty simple batch script with any programming language.

Answered by Eruditass on March 13, 2021

Here's a Python script that moves JPG files when a corresponding RAW file exists. Useful on Mac OS X!

import os
import shutil

raw_ext = '.CR2'
jpg_ext = '.JPG'
destination = '/Users/JohnSmith/Desktop/jpgs/'

for filename in os.listdir('.'):
    (shortname, extension) = os.path.splitext(filename)

    if extension == raw_ext:
        if os.path.isfile(shortname + jpg_ext):
            print('Moving ' + shortname + jpg_ext + '...')
            shutil.move(shortname + jpg_ext, destination)

Answered by ttaveira on March 13, 2021

I wrote the following Python script. Compared with ttaveira's script, it does some extra work.

  • Looks in sub-directories.
  • Creates destination waste directory.
  • Removes files that already exist in waste directory to avoid move errors.

# Script:      remove_jpg_if_raw_exists.py
#
# Description: This script looks in all sub directories for
#              pairs of JPG and RAW files.
#              For each pair found the JPG is moved to a
#              waste basket directory.
#              Otherwise JPG is kept.
#
# Author:      Thomas Dahlmann

import os, fnmatch

# define your file extensions here, case is ignored
raw_extension = "nef"
jpg_extension = "jpg"

# define waste basket directory here
waste_dir = "c:\\image_waste_basked\\"

##### do not modify below ##########

# recursive find files 
def locate(pattern, root=os.curdir):
    '''Locate all files matching supplied filename pattern 
    in and below root directory.'''
    for path, dirs, files in os.walk(os.path.abspath(root)):
        for filename in fnmatch.filter(files, pattern):
            yield os.path.join(path, filename) 

# get base names from raw's
raw_hash = {}
for raw in locate("*." + raw_extension):
    base_name = os.path.basename(raw)
    base_name = os.path.splitext(base_name)[0]
    raw_hash[base_name] = True

# make waste basket dir
if not os.path.exists(waste_dir):
    os.makedirs(waste_dir)

# find pairs and move jpgs of pairs to waste basket    
for jpg in locate("*." + jpg_extension):
    base_name = os.path.basename(jpg)
    base_name = os.path.splitext(base_name)[0]
    if base_name in raw_hash:
        jpg_base_name_with_ext = base_name + "." + jpg_extension
        new_jpg = os.path.join(waste_dir, jpg_base_name_with_ext)
        print("%s => %s" % (jpg, waste_dir))
        if os.path.exists(new_jpg):
            os.remove(jpg)
        else:
            os.rename(jpg, new_jpg)

Answered by Tomy on March 13, 2021

Here is a modified version of Tomy's Python script. Differences:

  • multiple raw extensions allowed
  • remove jpg only if the pairs are in the same folder (avoid accidental removal of a jpg named like a raw file in an other folder)
  • case insensitive
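
The three points above can be sketched in a few lines. This is a minimal illustration, not the script that follows: the name `paired_jpgs` is hypothetical, and each extension is lower-cased before comparison, which is an assumption that also lets mixed-case spellings such as `.Jpg` match.

```python
import os

# Compared in lower case; extend this set for other RAW formats.
RAW_EXTENSIONS = {".dng", ".cr2", ".nef", ".crw"}
JPG_EXTENSION = ".jpg"


def paired_jpgs(root):
    """Yield JPG paths that have a RAW file with the same base name
    in the same directory, ignoring the case of the extension."""
    for path, _dirs, files in os.walk(root):
        raw_names = set()
        jpgs = []
        for filename in files:
            base, ext = os.path.splitext(filename)
            ext = ext.lower()
            if ext in RAW_EXTENSIONS:
                raw_names.add(base)
            elif ext == JPG_EXTENSION:
                jpgs.append((base, filename))
        # Only JPGs whose RAW partner is in this very folder qualify.
        for base, filename in jpgs:
            if base in raw_names:
                yield os.path.join(path, filename)
```

Because the set of RAW base names is rebuilt for every directory, a JPG is never matched against a RAW file from another folder.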

#!/usr/bin/env python
# Script:      remove_jpg_if_raw_exists.py
#
# Description: This script looks in all sub directories for
#              pairs of JPG and RAW files.
#              For each pair found the JPG is moved to a
#              waste basket directory.
#              Otherwise JPG is kept.
#
# Author:      Thomas Dahlmann
# Modified by: Renaud Boitouzet

import os
import shutil

# define your file extensions here, case is ignored.
# Please start with a dot.
# multiple raw extensions allowed, single jpg extension only
raw_extensions = (".Dng", ".cR2", ".nef", ".crw")
jpg_extension = ".jPg"

# define waste basket directory here. Include trailing slash or backslash.
# Windows : waste_dir = "C:\\path\\to\\waste\\"
waste_dir = "/Users/marvin/Pictures/waste/"

##### do not modify below ##########

# find files
def locate(folder, extensions):
    '''Locate files in directory with given extensions'''
    for filename in os.listdir(folder):
        if filename.endswith(extensions):
            yield os.path.join(folder, filename)

# make waste basket dir
if not os.path.exists(waste_dir):
    os.makedirs(waste_dir)

# Make search case insensitive
raw_ext = tuple(map(str.lower,raw_extensions)) + tuple(map(str.upper,raw_extensions))
jpg_ext = (jpg_extension.lower(), jpg_extension.upper())

root=os.curdir
#find subdirectories
for path, dirs, files in os.walk(os.path.abspath(root)):
    print(path)
    raw_hash = {}
    for raw in locate(path, raw_ext):
        base_name = os.path.basename(raw)
        base_name = os.path.splitext(base_name)[0]
        raw_hash[base_name] = True

    # find pairs and move jpgs of pairs to waste basket
    for jpg in locate(path, jpg_ext):
        base_name = os.path.basename(jpg)
        base_name = os.path.splitext(base_name)[0]
        if base_name in raw_hash:
            jpg_base_name_with_ext = base_name + jpg_extension
            new_jpg = waste_dir + jpg_base_name_with_ext
            print("%s: %s = %s => %s" % (path, base_name, jpg, waste_dir))
            if os.path.exists(new_jpg):
                os.remove(jpg)
            else:
                shutil.move(jpg, new_jpg)

Answered by Renaud B. on March 13, 2021

Here’s a bash script for Mac OS X. It may work on Linux with some changes.

#!/bin/bash
read -p "Delete JPEGs when DNG exists? Ctrl-C to cancel. [Enter] to continue: "

for FILE in *.dng; do
  JPG_FILE="${FILE%.dng}.jpg"
  [ -f "${JPG_FILE}" ] && rmtrash "${JPG_FILE}" 1>/dev/null
done

rmtrash is a utility that moves files to the Trash, instead of deleting them outright. You can get it from MacPorts thus:

sudo port install rmtrash

If you’d like to avoid that, just replace rmtrash in the script with rm, which will immediately delete the JPG files.

Answered by Manas Tungare on March 13, 2021

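Here is a solution for bash (Linux or Mac OS X). On Windows, you can install Cygwin to get a copy of bash. Note that it works in the opposite direction to the other answers: it keeps every JPG plus each NEF listed directly after a JPG, and removes everything else in the current directory, including NEF files that have no JPG.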

keep=$(ls | grep -v ps | grep -A1 JPG | grep NEF)
for i in $keep ; do
   mv $i $i.keep
done

ls | egrep -v '(JPG|keep)' | xargs rm -f

change=$(ls | grep keep | sed 's/.keep//g')
for i in $change ; do
   mv $i.keep $i
done

Answered by Ben Pingilley on March 13, 2021

Here is another bash version using find (Linux). As with Ben Pingilley's answer, you can install Cygwin to get bash on Windows.

#!/bin/bash
read -p "please enter file suffix for raw format (e.g. ORF, NEF, CR2): " suffix

find . -type f -iname "*.${suffix}" | 
while IFS= read -r line
do
  # strip the raw suffix, whatever its case, and append the JPG suffix
  lowercase="${line%.*}.jpg"
  uppercase="${line%.*}.JPG"

  if [ -f "${lowercase}" ]
  then
    rm -v "${lowercase}"
  elif [ -f "${uppercase}" ]
  then
    rm -v "${uppercase}"
  else
    echo "${line}: no jpg present"
  fi
done

Answered by bsod on March 13, 2021

Working on Mac OS X, I was missing a sanity check for "same content" in the previous answers. I had duplicate names for different pictures because I had forgotten to enable the image counter in my camera. Here's my version, which checks the EXIF information for the same capture time:

You need to run

sudo port install rmtrash exiv2

before you can use the following command. It was written to compare JPG with NEF files from my Nikon D90. Adjust the file extensions according to your needs.

find . -name '*.NEF' | sed 's/\.NEF$/.JPG/' | xargs find 2>/dev/null | 
xargs perl -e 'foreach(@ARGV) {my $jpg=$_; my $nef=s/\.JPG$/.NEF/r; my $tjpg = `exiv2 -g Exif.Photo.DateTimeOriginal -pt $jpg`; my $tnef = `exiv2 -g Exif.Photo.DateTimeOriginal -pt $nef`; if($tjpg eq $tnef) {print "$jpg\n"}}' | 
xargs rmtrash

Without the sanity check, the whole thing becomes a very short one-liner:

find . -name '*.NEF' | sed 's/\.NEF$/.JPG/' | xargs find 2>/dev/null | xargs rmtrash
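
The capture-time check can also be written out step by step in Python, which may be easier to adapt than a one-liner. This is a sketch under assumptions, not a translation of the command: `same_capture_time`, `safe_to_remove` and the caller-supplied `read_capture_time` function are all hypothetical names, and no particular EXIF library is implied. Unlike the shell version, a file without a readable timestamp is never treated as a match.

```python
import os


def same_capture_time(jpg_path, raw_path, read_capture_time):
    """Return True only if both files report the same, non-empty capture time.

    read_capture_time is supplied by the caller (an assumption of this
    sketch). It maps a file path to a timestamp string, or to None when
    the file has no usable EXIF data.
    """
    jpg_time = read_capture_time(jpg_path)
    raw_time = read_capture_time(raw_path)
    return jpg_time is not None and jpg_time == raw_time


def safe_to_remove(root, read_capture_time, raw_ext=".NEF", jpg_ext=".JPG"):
    """List JPGs whose same-named RAW neighbour has the same capture time."""
    result = []
    for path, _dirs, files in os.walk(root):
        for filename in sorted(files):
            base, ext = os.path.splitext(filename)
            if ext != raw_ext:
                continue
            jpg = os.path.join(path, base + jpg_ext)
            raw = os.path.join(path, filename)
            if os.path.isfile(jpg) and same_capture_time(jpg, raw, read_capture_time):
                result.append(jpg)
    return result
```

The reader function is passed in so that the pairing logic can be tried with a plain dictionary of fake timestamps before it is wired to a real EXIF tool.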

Answered by André Pareis on March 13, 2021

Here's my take on this issue. A lot of good ideas came from earlier scripts mentioned here.

This is a bash script for OS X. It looks for pairs of files that share a base filename, one with a dng extension and one with a jpg extension. When a jpg has exactly the same name as a dng, the script lists it (-e), moves it (-m), or deletes it (-d).

It will go through subfolders, so you could use it for your entire catalog or just parts of it.

For other raw file extensions just substitute *.dng in the script with your preferred extension.

Warning: Two different images can share a base name and differ only in extension. The script cannot tell these apart from genuine pairs, so such a jpg will be moved or deleted as well.

Here's how to use the script:

Usage: dng-jpg.sh [-m <path>] [-d <path>] [-e <path>] [-h]

-m: for move   (moves files to <path>/duplicates)
-d: for delete (deletes duplicate files)
-e: for echo   (lists duplicate files)
-h: for help 

Basic usage would work like this:

$ ./dng-jpg.sh -e /Volumes/photo/DNG/2015

That would echo the names of all jpg files that have a dng file with the same name.

The result would look something like this:

Echo selected with path: /Volumes/photo/DNG/2015
/Volumes/photo/DNG/2015/03/18/2015-03-18_02-11-17.jpg
/Volumes/photo/DNG/2015/06/01/2015-06-01_05-10-50.jpg
/Volumes/photo/DNG/2015/06/01/2015-06-01_05-10-56.jpg
/Volumes/photo/DNG/2015/06/01/2015-06-01_05-11-39.jpg
/Volumes/photo/DNG/2015/06/01/2015-06-01_05-11-54.jpg
/Volumes/photo/DNG/2015/06/01/2015-06-01_05-12-26.jpg
/Volumes/photo/DNG/2015/06/01/2015-06-01_05-12-43.jpg
/Volumes/photo/DNG/2015/06/01/2015-06-01_05-13-21.jpg
/Volumes/photo/DNG/2015/06/01/2015-06-01_05-13-56.jpg
9 files found.

Now if I want to delete the files I would just switch the -e to -d:

$ ./dng-jpg.sh -d /Volumes/photo/DNG/2015

Or if I'd like to move the files to /duplicates I'd execute it with -m.

$ ./dng-jpg.sh -m /Volumes/photo/DNG/2015

Now the duplicate jpg files would be in /Volumes/photo/DNG/2015/duplicates

Here's the script: dng-jpg.sh

#!/bin/bash

# Init variables
isSetM=0
isSetD=0
isSetE=0
isSetCount=0
counter=0

#Display usage info
usage() {

    cat <<EOF

Usage: dng-jpg.sh [-m <path>] [-d <path>] [-e <path>] [-h]

-m: for move   (moves files to <path>/duplicates)
-d: for delete (deletes duplicate files)
-e: for echo   (lists duplicate files)
-h: for help 

EOF
  exit 1
}

#Check for parameters
while getopts ":m:d:e:h" opt; do
  case ${opt} in
    m)
        isSetM=1
        let isSetCount="$isSetCount+1"
        arg=${OPTARG}
      echo "Move selected with path:" $arg
      ;;
    d)
        isSetD=1
        let isSetCount="$isSetCount+1"
        arg=${OPTARG}
      echo "Delete selected with path:" $arg
      ;;
    e)
        isSetE=1
        let isSetCount="$isSetCount+1"
        arg=${OPTARG}
      echo "Echo selected with path:" $arg
      ;;
    h)
        let isSetCount="$isSetCount+1"
        usage
      ;;
    \?)
      echo "Invalid option: -$OPTARG" >&2
      usage
      ;;
    :)
      echo "Option -$OPTARG requires a directory argument." >&2
      usage
      ;;
    *)
      usage
      ;;
  esac
done

# If no parameters, show usage help and exit
if test -z "$1"; then
    usage
fi

# If more than one option was given, show usage help and exit
if (($isSetCount > 1)); then
    usage
fi

#Verify directory
if [ ! -d "$arg" ]; then
  echo "$arg is not a path to a directory." >&2
  usage
fi

#Now set it as a basedir
BASEDIR=$arg
WASTEDIR="$BASEDIR/duplicates/"
if (( $isSetM==1 )); then
    mkdir $WASTEDIR
fi

for filename in $(find $BASEDIR -name '*.dng' -exec echo {} \; | sort); do
   prefix=${filename%.dng}
    if [ -e "$prefix.jpg" ]; then
        let counter="$counter+1"
        if (( $isSetE==1 )); then
            echo "$prefix.jpg"
        fi
        if (( $isSetM==1 )); then
            mv $prefix.jpg $WASTEDIR
        fi
        if (( $isSetD==1 )); then
            rm $prefix.jpg
        fi
    fi
done

echo "$counter files found."
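
For comparison, the echo/move/delete idea can be sketched in Python. This is an illustration only: the function name `handle_duplicates` and its `mode` parameter are hypothetical and stand in for the -e, -m and -d options. Like the script, it assumes lowercase dng and jpg extensions and a single duplicates folder under the base directory.

```python
import os
import shutil


def handle_duplicates(basedir, mode, raw_ext=".dng", jpg_ext=".jpg"):
    """Apply mode ("echo", "move" or "delete") to every JPG that has a
    same-named RAW file beside it; return the number of JPGs found."""
    if mode not in ("echo", "move", "delete"):
        raise ValueError("mode must be 'echo', 'move' or 'delete'")
    waste_dir = os.path.join(basedir, "duplicates")
    # Collect the matches first so that moving files cannot disturb the walk.
    found = []
    for path, _dirs, files in os.walk(basedir):
        for filename in sorted(files):
            if filename.endswith(raw_ext):
                jpg = os.path.join(path, filename[: -len(raw_ext)] + jpg_ext)
                if os.path.isfile(jpg):
                    found.append(jpg)
    for jpg in found:
        if mode == "echo":
            print(jpg)
        elif mode == "move":
            os.makedirs(waste_dir, exist_ok=True)
            # Raises shutil.Error if a file of that name is already there.
            shutil.move(jpg, waste_dir)
        else:
            os.remove(jpg)
    return len(found)
```

Running it with `mode="echo"` first mirrors the -e workflow described above: nothing changes on disk until the list looks right.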

Answered by T. Toivonen on March 13, 2021

I like the bash script for OS X (by T.Toivonen), but I have noticed there are a few issues.

  • It did not like my directory names, which contain spaces. That required slightly different handling of the find command.

  • The original script only works for lowercase extensions. I have slightly improved that part of the script to account for uppercase extensions as well. Note that it only accepts DNG+JPG or dng+jpg pairs, and it will ignore any other combinations such as DNG+jpg or DnG+JpG.

  • The original solution proposed only one wastedir location, whereas my fix allows a subdirectory to be created on each directory branch as it travels through. You define the name of that directory before the loop.

  • I like to see what's going on, especially when mv or rm commands are used ;)

To save space, I am showing only the last part of the script: setting the basedir and wastedir, and the loop.

[...]

#Now set it as a basedir
BASEDIR=$arg
WASTEDIR=duplicates
find "$BASEDIR" -iname '*.dng' -print0 | while IFS= read -r -d '' filename 
    do
    filepath="${filename%/*}"
    basename="${filename##*/}"
    prefix="${basename%.*}"
    suffix=${filename##*.}
    if [[ "$suffix" =~ [A-Z] ]]; then rsuffix="JPG"; else rsuffix="jpg"; fi 
    if [ -e "$filepath/$prefix.$rsuffix" ]; then
        let counter="$counter+1"
        if (( $isSetE==1 )); then
            echo "FOUND: $filepath/$prefix.$rsuffix"
        fi
        if (( $isSetM==1 )); then
            echo "Moving $filepath/$prefix.$rsuffix to $filepath/$WASTEDIR"
            if [ ! -d "$filepath/$WASTEDIR" ]; then mkdir "$filepath/$WASTEDIR"; fi
            mv "$filepath/$prefix.$rsuffix" "$filepath/$WASTEDIR"
        fi
        if (( $isSetD==1 )); then
            echo "Removing duplicate $filepath/$prefix.$rsuffix"
            rm "$filepath/$prefix.$rsuffix"
        fi
    fi
done
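
The per-directory waste folder can be sketched in Python as well, where directory names with spaces need no special treatment. This is a hedged illustration rather than a port of the script: `move_to_local_waste` and `waste_name` are hypothetical names. The suffix rule is the same as in the loop above: a dng suffix containing an uppercase letter expects JPG, anything else expects jpg.

```python
from pathlib import Path
import shutil


def move_to_local_waste(basedir, waste_name="duplicates"):
    """Move each JPG that has a same-named DNG beside it into a
    waste_name subfolder of its own directory; return the new paths."""
    moved = []
    # sorted() materialises the listing before any file is moved.
    for raw in sorted(Path(basedir).rglob("*")):
        if not raw.is_file() or raw.suffix.lower() != ".dng":
            continue
        # An upper-case letter in the RAW suffix means we look for .JPG.
        jpg_suffix = ".JPG" if any(c.isupper() for c in raw.suffix) else ".jpg"
        jpg = raw.with_suffix(jpg_suffix)
        if jpg.is_file():
            waste = jpg.parent / waste_name
            waste.mkdir(exist_ok=True)
            target = waste / jpg.name
            shutil.move(str(jpg), str(target))
            moved.append(target)
    return moved
```

Keeping the waste folder next to the originals makes it easy to review one shoot at a time and to restore a JPG to the place it came from.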

Answered by Filip Wolak on March 13, 2021
