12/21 uniq can get rid of two identical lines if they occur right
after each other. But how do you get rid of duplicate lines when
they don't occur right after each other? Using sort works, but
then all the lines end up out of order, which is a problem.
\_ perl: keep a counter for each unique line seen so far. Hell, you
could even do it with a temp file and plain /bin/sh.
\_ perl -ne '$m{$_}++||print' <file>
this keeps the first copy of each line (the uniq thing), it does
not kill all duplicates. -vadim
\_ do it scalably w/ bash, e.g. let the sort/uniq tools do
the heavy lifting:
n=0
while read line ; do echo "$n $line" ; n=$(($n + 1)); done \
| sort -s -k 2 | uniq -f 1 | sort -n \
| while read num rest ; do echo "$rest" ; done
(the -s makes the content sort stable, so when the line numbers
grow past one digit the first occurrence is still the one kept)
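A sample run of the numbering approach above. The added -s (stable sort, supported by GNU and BSD sort) makes ties between identical lines break by input order, so the first occurrence survives:

```shell
# Number lines, sort by content (stably), drop repeated content,
# then sort the surviving numbers back and strip them.
printf 'a\nb\na\nc\nb\n' \
  | { n=0; while read line; do echo "$n $line"; n=$(($n + 1)); done; } \
  | sort -s -k 2 | uniq -f 1 | sort -n \
  | while read num rest; do echo "$rest"; done
# prints: a b c (one per line)
```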
\_ cat -n <file> | sort -uk 1.8 | sort | cut -c8- -vadim
\_ to do what you really asked, you can replace sort -uk 1.8 with
sort -k 1.8 | uniq -uf1. -vadim
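The uniq -uf1 variant in action: it removes every line that appears more than once, keeping only the genuinely unique ones. (cat -n pads the line number to six columns plus a tab, hence the character offset 8 and cut -c8-.)

```shell
# Number lines, sort by content (offset 8 skips the number),
# keep only non-repeated lines (-u), restore order, strip numbers.
printf 'a\nb\na\nc\nb\n' | cat -n | sort -k 1.8 | uniq -uf1 | sort | cut -c8-
# prints: c
```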
\_ another one (zsh):
typeset -A m; while read l; do [ $m[$l] ] || echo $l && \
m[$l]=1; done -vadim
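The same idea works in bash 4+ as well; a hypothetical equivalent, assuming associative-array support via declare -A:

```shell
# Read lines, printing each one only the first time it is seen.
printf 'a\nb\na\nc\nb\n' | bash -c '
  declare -A m
  while IFS= read -r l; do
    [ -n "${m[$l]}" ] || printf "%s\n" "$l"
    m[$l]=1
  done'
# prints: a b c (one per line)
```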
\_ /tmp/unique.c is something I wrote on SunOS5 a few years ago.
--- yuen
\_ waaaay unsafe. The least you could do is store md5s in the
hash. -vadim
\_ It's just some quick utility I came up with to discard
duplicated path names. It wasn't meant to be secure. --- yuen