Mac Geekery: Get your geek on.
This is a quickie. On the Mac you regularly handle files with spaces in the Finder without issue, and even on the command line when you put quotes around them or let tab-completion escape them properly. However, if you try to do things in a shell script, like a for loop, on filenames that involve a space you’re going to hit a wall. For splits items on a space, regardless of whether they’re quoted (if they’re stored in a variable). However, the read command does not. Observe.
find ~ -name '* *' | while read FILE
do
echo $FILE rocks.
done
And that’s that. Run the command and pipe to the while stanza and it works like a charm.
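To see the two loops side by side, here is a minimal sketch. It assumes a throwaway directory from mktemp (whose path contains no spaces) and two made-up file names; the helper names count_with_for and count_with_read are illustrative only, and the read version adds IFS= and -r as extra hardening.

```shell
#!/bin/sh
# Build a scratch directory holding two files whose names contain a space.
demo_dir=$(mktemp -d)
touch "$demo_dir/first file.txt" "$demo_dir/second file.txt"

# Count loop iterations when `for` walks the unquoted output of find.
count_with_for() {
    n=0
    for FILE in $(find "$1" -name '* *'); do
        n=$((n + 1))
    done
    echo "$n"
}

# Count loop iterations when find's output is piped to `while read`.
count_with_read() {
    find "$1" -name '* *' | {
        n=0
        while IFS= read -r FILE; do
            n=$((n + 1))
        done
        echo "$n"
    }
}

echo "for loop iterations:   $(count_with_for "$demo_dir")"
echo "while read iterations: $(count_with_read "$demo_dir")"
```

With the two files above, the for version should report 4 iterations (each name is cut at its space) and the read version 2.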
About Adam Knight
Author Biography: Adam Knight is one of the founders of Mac Geekery and is a geek at heart. Programmer by day, hacker by night, his daily life revolves around the Macintosh platform, which he has used and programmed for since the early days of System 7 when his LCII replaced his Apple //c. In between tech jobs, he’s managed to learn the basics of any web hacker: PHP, MySQL, Perl, Apache, Linux, *BSD, and the intricacies of ./configure --prefix=~/bombshelter/. Today, codepoet is concentrating on blogging again, writing some software for the Mac by himself (including Notae) and for his company (such as Switchblade), and has a few other toys coming out soon. Bug him over AIM or email [link fixed].
That example is faulty. Your problem does not show itself with the * operator, as the following will attest:
for file in *
do
echo "$file" rocks
done
That will have the exact same output. Your problem is with a variable containing newline-delimited files, like such:
In any case, that’s not necessary. If you set IFS to $'\n' then it will only split on newlines, not spaces. And you can enclose the entire thing in a subshell using ( foo ) to scope IFS to just that for loop. Example:
(
IFS=$'\n'
for file in $files
do
echo "$file" rocks.
done
)
The example you gave will, indeed, show the issue if there are files with spaces. If listing a folder with “Application Support” inside you’ll get:
As for the newlines, the variable does not contain them. The shell, or something, makes them spaces such that “echo $FILES” will give you a space-delimited list of names with a space. I suggest you test it.
—
cp
The variable does contain the newlines; I suspect you’re leaving off the quotes somewhere important.
We now have a file with newlines in its name, and one with spaces; the cat was required to keep ls from being helpful and filtering out unprintable characters.
Now, we try the original suggestion:
Well, that worked for files with spaces, but not with newlines, so it’s still dangerous.
Now, we try the simplest possible method, but with proper quoting, as Eridius suggested.
Hey, that worked perfectly.
You misunderstand the original issue. I’m taking the output of find and looping over it. for will always break in this case and your example does nothing to resolve that. This is not a matter of quoting but a matter of where for breaks up results.
Even if I saved the output of find in something like $FILES it would have the same effect. read, however, works perfectly fine there.
—
cp
The read solution isn’t a solution, though; it still fails on filenames with newlines. And the problem isn’t with for itself, it’s with word separation.
“Don’t use find” is one solution, as Eridius and I have shown, or “use find’s -0 safety option” as Rae showed.
Saying “For splits items on a space, regardless of it they’re quoted (if they’re stored in a variable).” is just incorrect and confusing.
Obviously quotes do matter.
They matter on the echo line, too; without the quotes, the arguments to echo are resplit, losing any non-single-space separators:
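A minimal sketch of that effect, assuming a made-up file name with two consecutive spaces (the variable names are illustrative):

```shell
#!/bin/sh
# Hypothetical name with two spaces between the words.
FILE='two  spaces.txt'

# Unquoted: the shell splits the value into two words, and echo joins
# its arguments with a single space, so one of the spaces is lost.
unquoted=$(echo $FILE)

# Quoted: echo receives one argument and prints it unchanged.
quoted=$(echo "$FILE")

printf 'unquoted: [%s]\n' "$unquoted"
printf 'quoted:   [%s]\n' "$quoted"
```

The brackets are only there to make the spacing visible. With cp or mv in place of echo, the unquoted form would be passed two separate arguments instead of one file name.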
This really isn’t a matter for a holy war. Not using find is not the answer when find is the program you need to use and parse the results of. While filesystems allow new-lines in the filenames, it’s rare and stupid enough that I don’t care if it breaks on those files.
For the love of Steve, all I’m doing is showing a way to use find with filenames with spaces in it because for loops don’t work. It’s handy, it works, and when it’s needed it’s damned useful. Adding nulls and piping to xargs is handy when possible, but when you need to perform a number of tricks on a file that find found, that’s just not very appropriate. Possible, yes, but not appropriate.
This is an 80% solution. Hell, it’s a 95% solution. In this specific case, a for loop is a 0% solution because of how it interprets the output of find. Thus, this is better than nothing for what I was trying to do and I felt cause to share it. If you think something else is better then do something else.
To properly handle spaces, you need to make sure $FILE is double-quoted anywhere it is used. Inside the for-loop, it should say echo "$FILE". Otherwise, bash thinks that you are trying to pass multiple arguments to echo. The difference would be apparent if a filename had multiple consecutive spaces or if you were performing an operation like cp instead of echo.
Ahh, very true. My intent was to demo the looping more than run an actual command, but if you were running a command later you would certainly need the quotes around the variable.
—
cp
find . -name '* *' -print0 | xargs -0 -n 1 -I % echo % rocks
You need the "-print0" and "-0" to delimit things with zeros instead of spaces.
You need "-n 1" to do them one at a time instead of as many as will fit in a command line.
You need "-I %" to put the argument not at the end of the command.
This is the most robust and elegant solution for my purposes, thanks for the tip!
www.scrambledchannel.org
If my find piped to xargs solution doesn’t allow for complex enough operations, you can use a shell function. e.g.:
$ lsd() { for i in "$@"; do ls -ld "$i"; done; }
$ find . -name '* *' -print0 | xargs -0 lsd
Note that no "-n 1" or "-I %" arguments to xargs are needed, since that is all handled inside the shell function, which can also span multiple lines if needs be.
rae wrote:
That does not work in a shell script sir.
(I really wish it did). Alas, it merely barks…
Nice try.
you need to make the subroutine, as rae did, on the line before.
lsd() { for i in "$@"; do ls -ld "$i"; done; }
and make sure your shebang line reads #!/bin/bash.
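One commonly used workaround, which is not taken from this thread, is to export the function and have xargs start a new bash that calls it. A minimal sketch, assuming bash (export -f is a bash feature) and reusing the lsd name from above; the scratch directory and file name are made up:

```shell
#!/bin/bash
# Same idea as the lsd function above: long-list each argument.
lsd() {
    for i in "$@"; do
        ls -ld "$i"
    done
}
# Make the function visible to child bash processes.
export -f lsd

# Scratch directory with one file whose name contains a space.
demo_dir=$(mktemp -d)
touch "$demo_dir/has space.txt"

# xargs cannot run a shell function directly, so it runs bash, which can.
# The "_" fills $0; the file names arrive as "$@".
listing=$(find "$demo_dir" -name '* *' -print0 | xargs -0 bash -c 'lsd "$@"' _)
echo "$listing"
```

The cost is one extra bash process per xargs batch, which is usually negligible next to the work done on each file.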
First off, the suggestion to “not use find” was absolutely absurd. Get real.
Second, the context here is “ if you try to do things in a shell script” and that’s
somewhat different from just typing commands in Terminal. The problem with
piping to xargs -0 ( or even using -exec {} \; ) is that — in a shell script — they
expect some command in the environment’s $PATH to execute. But how can we
make a self-contained script… and have the find results sent to a function inside
the script??? Something like
xargs -0 myScriptFunction
doesn’t fly.
But…
find blah blah blah | myScriptFunction
where myScriptFunction() is built with:
while IFS= read -r inputListItem
do
whatever we want with "$inputListItem"
done
does work.
As has been noted, the only “drawback” is filenames with newline chars (and,
as I’ve discovered \ backslashes as well) don’t parse properly. Not a mission
critical issue for most users, I wouldn’t imagine. This was a great tip… which
(unfortunately) got taken out of context.
FOR what it’s worth,
HI
EDIT: a little googling has revealed that adding the -r option to read
will cure the backslash problem. i.e.,
while read -r
EDIT#2: YIKES! Speaking of spaces: if an item’s name ends with a space,
read again misbehaves. More googling produced this sublime solution:
while IFS= read -r
Seems like the field separator only affects the one read line (not the entire do loop).
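Put together, the self-contained pattern might look like the sketch below. The name myScriptFunction comes from the comment above; the scratch directory, the two awkward file names, and the bracketed output are assumptions for illustration only.

```shell
#!/bin/sh
# Read one path per line from stdin. IFS= keeps leading and trailing
# whitespace; -r keeps backslashes literal.
myScriptFunction() {
    while IFS= read -r inputListItem; do
        printf '[%s]\n' "$inputListItem"
    done
}

# Scratch directory with a name ending in a space and a name holding a backslash.
demo_dir=$(mktemp -d)
touch "$demo_dir/ends with space " "$demo_dir/back\\slash name"

find "$demo_dir" -name '* *' | myScriptFunction
```

The brackets make the trailing space visible. File names containing newlines would still be split, as noted earlier in the thread.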
?
Wow, I bet Adam is sorry he brought this up. No good deed goes unpunished. Good discussion anyway.
I was able to use this information to solve my unix-newb problem, which was how to loop over files that included spaces in their names.
for f in c:/somepath/z*.html; do
md5sum "$f"
done
works fine for me, while my original
for f in `ls -1 c:/somepath/z*.shtml`; do
md5sum "$f"
done
dies in flames due to the way ‘for’ splits.
Thanks everyone.
The real problem is that some supergenius decided to allow whitespace in file names in the first place.
Hmmm…. I think there is still a problem when trying to script. Let’s say that you are passing in an argument list, where the arguments are files that have spaces.
% do_something_to_files /Users/me/directory\ with\ spaces/file1 /Users/me/directory\ with\ spaces/file2
First, you would probably want to shift away $1. Your argument list, $*, would then have a bunch of text and spaces. How would you loop through the files?
No, that’s trivial:
do_something:
#!/bin/sh
for arg
do
echo "$arg"
done
That’s it!
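A for loop with no "in" list walks the positional parameters as if "$@" had been written, which is why the script above keeps each file name intact. A small sketch of the contrast with unquoted $*, using hypothetical function names and the paths from the question:

```shell
#!/bin/sh
# Count arguments as seen through "$@" (each argument stays intact).
count_quoted_at() {
    n=0
    for arg in "$@"; do n=$((n + 1)); done
    echo "$n"
}

# Count words as seen through unquoted $* (arguments are re-split on spaces).
count_unquoted_star() {
    n=0
    for arg in $*; do n=$((n + 1)); done
    echo "$n"
}

count_quoted_at "/Users/me/directory with spaces/file1" "/Users/me/directory with spaces/file2"
count_unquoted_star "/Users/me/directory with spaces/file1" "/Users/me/directory with spaces/file2"
```

The first call prints 2 and the second prints 6, since each path breaks into three words once the quoting is lost.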
luke
I just tested the original assertion under bash on the Mac (and on Linux), since it sounded quite wrong to me.
The test confirmed that it was indeed wrong.
“for” does not break tokens with spaces. (The shell does this job, not each of the built-in commands, like “for”!) Also, the shell “tokenises” only before command execution and after variable expansion. It does it after variable expansion so that you can do things like: for f in $list …
In short, the string
for f in *
is broken into 4 tokens; the * is expanded into all the files; and that’s that. Filenames with spaces or whatever are not then further broken up.
You must quote any variable when you use it to avoid exposing yourself to this problem. That’s probably what tripped you up and confused you. I.e. it’s bad shell programming to ever use an unquoted variable, unless you specifically want the resulting variable to be tokenised when it has spaces.
You’ve made the thing into a much bigger problem than it really is!
luke
no good answers
need better shell
hmm, i didn’t know we had this many different users of this site
better shell, Hmm is zsh better?
It seems that several people came up with fine solutions, even options for using find, or not. what is the problem? These aren’t new questions, they are old, very well documented UNIX questions. Read the man page for find. Here are some other useful options for doing other things (not using loops, for quick solutions):
-delete - delete all found files
-execdir command {} \; - run specific command on every found file (in dir of file)
-okdir command {} \; - just like execdir, asking user if they are sure
Here are 2 ways to delete every file that has evil in the name:
find . -name '*evil*' -delete
find . -name '*evil*' -okdir rm {} \;
Look here: http://wooledge.org:8000/BashFAQ
find . -print0 | while read -d $'\0' file; do mv "$file" "${file// /_}"; done
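A slightly more defensive variant of that one-liner, as a sketch assuming bash (read -d '' and ${var// /_} are not POSIX). It renames only the last path component, so a directory with a space in its name does not break the paths of the files inside it. The function name rename_spaces and the scratch files are hypothetical.

```shell
#!/bin/bash
# Replace spaces with underscores in every name found below directory $1.
rename_spaces() {
    # -mindepth 1 skips the starting directory itself.
    # -depth lists a directory's contents before the directory, so renaming
    # a parent never invalidates a path still waiting in the pipe.
    find "$1" -mindepth 1 -depth -name '* *' -print0 |
    while IFS= read -r -d '' path; do
        dir=${path%/*}      # everything before the last slash
        base=${path##*/}    # the last path component
        mv -- "$path" "$dir/${base// /_}"
    done
}

demo_dir=$(mktemp -d)
mkdir "$demo_dir/my folder"
touch "$demo_dir/my folder/a b c.txt"
rename_spaces "$demo_dir"
```

Because the names travel NUL-delimited from find to read, this form also copes with file names that contain newlines.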
This discussion helped me achieve what I wanted.
Identify, Quantify, and then Eliminate dumb temporary files from Microsoft Office applications.
What are the files? (human readable list, newline delimited, assume no files with newlines in the name)
How much space do these files consume?
Delete them.
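The commands used for those three steps did not survive in this comment, so what follows is only a sketch of how they might look. The name pattern '~$*' is an assumption about what the Office temporary files are called, and the function names, scratch directory, and sample files are all hypothetical.

```shell
#!/bin/sh
# ASSUMPTION: the temporary files match the name pattern '~$*'.
# The original comment does not say which pattern it used.
PATTERN='~$*'

# 1. Identify: one path per line (assumes no newlines in file names).
list_temp() {
    find "$1" -type f -name "$PATTERN"
}

# 2. Quantify: add up the byte counts reported by wc -c.
total_bytes() {
    list_temp "$1" | {
        total=0
        while IFS= read -r f; do
            size=$(wc -c < "$f")
            total=$((total + size))
        done
        echo "$total"
    }
}

# 3. Eliminate: let find hand the names straight to rm.
delete_temp() {
    find "$1" -type f -name "$PATTERN" -exec rm -- {} +
}

# Try it on a scratch directory rather than on real documents.
demo_dir=$(mktemp -d)
printf 'hello' > "$demo_dir/~\$report with spaces.docx"
touch "$demo_dir/keep me.txt"

list_temp "$demo_dir"
before=$(total_bytes "$demo_dir")
echo "bytes used: $before"
delete_temp "$demo_dir"
```

Running the list and total steps first and reading the output before calling delete_temp is the safer habit.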
For what it’s worth I owe a big debt of gratitude to the original author. When in a bind, that solution worked well for me in cygwin on Windows Vista. I didn’t understand any of the comments below that.