Server Fault Asked by Zulakis on December 13, 2021
I got a string like the following:
test.de. 1547 IN SOA ns1.test.de. dnsmaster.test.de. 2012090701 900 1000 6000 600
Now I want to replace all the tabs/spaces in between the records with just a single space so I can easily use it with cut -d " ".
I tried the following:
sed "s/[\t[:space:]]+/[:space:]/g"
and various variations, but couldn't get it working. Any ideas?
Here are some interesting methods I found via experiments (using xxd to see tabs).
echo -e \\033c
s=$(echo -e "a\t\tb\t\tc\t\td\t\te\t\tf")
echo 'original string with tabs:'
echo "$s"
echo "$s" | xxd
echo -e '\nusing: \techo "$s" | tr -s \\\\t " "'
echo "$s" | tr -s \\t " "
echo "$s" | tr -s \\t " " | xxd
echo -e '\nusing: \techo "$s" | sed '"'s/\\\\t\\+/ /g'"
echo "$s" | sed 's/\t\+/ /g'
echo "$s" | sed 's/\t\+/ /g' | xxd
echo -e '\nusing: \techo ${s/ / }'
# unquoted expansion: the shell's word splitting collapses the whitespace
echo ${s/ / }
echo ${s/ / } | xxd
z=$(echo $s)
echo -e '\nusing: \tz=$(echo $s); echo "$z"'
echo "$z"
echo "$z" | xxd
# file.in is assumed to contain the tab-separated input
echo -e '\nusing: \tread s < file.in; echo $s'
read s < file.in
echo $s
echo $s | xxd
echo -e '\nusing: \twhile read s; do echo $s; done'
while read s;
do
    echo $s
done < file.in
Answered by cognativeorc on December 13, 2021
You can use the -s ("squeeze") option of tr:
$ tr -s '[:blank:]' <<< 'test.de. 1547 IN SOA ns1.test.de. dnsmaster.test.de. 2012090701 900 1000 6000 600'
test.de. 1547 IN SOA ns1.test.de. dnsmaster.test.de. 2012090701 900 1000 6000 600
The [:blank:] character class comprises both spaces and tabs.
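With a single set, tr -s only squeezes repeats of the same character, so a tab stays a tab. A minimal sketch of a variant that also turns tabs into spaces, so the result works with cut -d " ", might look like the following. The sample record is made up for illustration, and giving a one-character second set relies on tr padding it to the length of the first set (GNU and BSD tr do this; POSIX leaves it unspecified).

```shell
# Sample record with mixed tabs and runs of spaces (made up for illustration).
record=$(printf 'test.de.\t\t1547   IN \t SOA\tns1.test.de.')

# Map every blank (space or tab) to a space, then squeeze runs of spaces.
squeezed=$(printf '%s\n' "$record" | tr -s '[:blank:]' ' ')
printf '%s\n' "$squeezed"

# Fields can now be picked with cut and a single-space delimiter.
ttl=$(printf '%s\n' "$squeezed" | cut -d ' ' -f 2)
printf '%s\n' "$ttl"
```

If portability matters more than brevity, tr -s '\t' ' ' uses equal-length sets and should give the same result for input that contains only tabs and spaces.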
Answered by Benjamin W. on December 13, 2021
I like using the following alias for bash. Building on what others wrote, it uses sed to replace runs of spaces with a single space, which gives consistent results from cut. At the end, I run the output through sed one more time to change each space to a tab so that it's easier to read.
alias ll='ls -lh | sed "s/ \+/ /g" | cut -f5,9 -d" " | sed "s/ /\t/g"'
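To see what the pipeline does without depending on a live directory listing, here is a small sketch that runs the same steps on one canned line. The line itself is a made-up example of ls -lh output, and the sketch swaps in the portable "s/  */ /g" and tr for the GNU-specific \+ and \t, which is an assumption about what you might want rather than part of the alias.

```shell
# One made-up line in the usual `ls -lh` layout (assumption for illustration).
line='-rw-r--r--   1 user  staff   4.0K Dec 13  2021 notes.txt'

# Collapse runs of spaces, keep fields 5 (size) and 9 (name),
# then turn the remaining space into a tab for readability.
out=$(printf '%s\n' "$line" | sed 's/  */ /g' | cut -d ' ' -f 5,9 | tr ' ' '\t')
printf '%s\n' "$out"
```

Note that file names containing spaces would spill into fields 10 and beyond, so this approach only suits simple names.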
Answered by CNS Security miked on December 13, 2021
Use sed -e "s/[[:space:]]\+/ /g"
Here's an explanation:
[          # start of character class
[:space:]  # The POSIX character class for whitespace characters. It's
           # functionally identical to [ \t\r\n\v\f] which matches a space,
           # tab, carriage return, newline, vertical tab, or form feed. See
           # https://en.wikipedia.org/wiki/Regular_expression#POSIX_character_classes
]          # end of character class
\+         # one or more of the previous item (anything matched in the brackets).
For your replacement, you only want to insert a space. [:space:] won't work there since that's an abbreviation for a character class and the regex engine wouldn't know what character to put there.
The + must be escaped in the regex because with sed's regex engine + is a normal character whereas \+ is a metacharacter for 'one or more'. On page 86 of Mastering Regular Expressions, Jeffrey Friedl mentions in a footnote that ed and grep used escaped parentheses because "Ken Thompson felt regular expressions would be used to work primarily with C code, where needing to match raw parentheses would be more common than backreferencing." I assume that he felt the same way about the plus sign, hence the need to escape it to use it as a metacharacter. It's easy to get tripped up by this.
In sed you'll need to escape +, ?, |, (, and ), or use -r to use extended regex (then it looks like sed -r -e "s/[[:space:]]+/ /g" or sed -re "s/[[:space:]]+/ /g").
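The difference between the two syntaxes can be checked with a short sketch like the one below. The sample string is made up, and -E is used in place of -r on the assumption that your sed accepts it (GNU and BSD sed both do); the last form avoids + altogether and should work in any POSIX sed.

```shell
# Made-up sample: two tabs, then three spaces.
s=$(printf 'a\t\tb   c')

# Basic regex: an unescaped + is a literal plus sign, so nothing matches
# and the string comes back unchanged.
bre_literal=$(printf '%s\n' "$s" | sed 's/[[:space:]]+/ /g')

# Extended regex: + means "one or more" without a backslash.
ere=$(printf '%s\n' "$s" | sed -E 's/[[:space:]]+/ /g')

# Portable basic regex: "one whitespace, then zero or more" instead of \+.
portable=$(printf '%s\n' "$s" | sed 's/[[:space:]][[:space:]]*/ /g')

printf '%s\n' "$ere" "$portable"
```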
Answered by Starfish on December 13, 2021