Daily Vim: Text Editor Tips, Tricks, Tutorials, and HOWTOs: Sort + De-dupe? Easy.

Thursday, March 27, 2008

Sort + De-dupe? Easy.

Sometimes when dumping data, it makes more sense from a performance perspective not to remove duplicates in the SQL query itself and to do it after the fact instead. This is easy to do in the shell:

sort --buffer-size=32M filename.csv | uniq > filename_uniq.csv

You can omit the --buffer-size argument to use sort's default, or set it to whatever size suits your data.
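To see the technique on something small, here is a minimal sketch. The file name sample.csv and its rows are made up for illustration, and LC_ALL=C is an assumption added so the ordering is byte-wise and the same on every system. It runs the sort-then-uniq pipeline without the buffer-size argument, then checks that sort's own -u option gives the same result in one step.

```shell
#!/bin/sh
# Hypothetical sample data: sample.csv and its rows are invented for this sketch.
export LC_ALL=C   # assumption: byte-wise ordering, so results match across systems
tmpdir=$(mktemp -d)
printf '%s\n' '2,bob' '1,alice' '2,bob' '3,carol' '1,alice' > "$tmpdir/sample.csv"

# Sort first so duplicate lines end up adjacent, then let uniq drop the repeats.
sort "$tmpdir/sample.csv" | uniq > "$tmpdir/sample_uniq.csv"

# The same de-duplication in a single step using sort -u.
sort -u "$tmpdir/sample.csv" > "$tmpdir/sample_sort_u.csv"

# Show the de-duplicated rows, then confirm both approaches agree.
cat "$tmpdir/sample_uniq.csv"
cmp "$tmpdir/sample_uniq.csv" "$tmpdir/sample_sort_u.csv" && echo "identical"
```

The sort step matters because uniq only collapses duplicate lines that sit next to each other. Note that both approaches treat every line alike, so a CSV header row would be sorted in with the data.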

1 comment:

graywh said (June 24, 2008 at 10:27 AM): Even better, sort has a -u option so you don't have to use uniq.
