How to Read a File and Sort It by Length in Java and Output to Another File
Original author(s) | Ken Thompson (AT&T Bell Laboratories) |
---|---|
Developer(s) | Various open-source and commercial developers |
Initial release | November 3, 1971 (1971-11-03) |
Operating system | Multics, Unix, Unix-like, V, Plan 9, Inferno, MSX-DOS, IBM i |
Platform | Cross-platform |
Type | Command |
License | coreutils: GPLv3+ |
In computing, sort is a standard command line program of Unix and Unix-like operating systems that prints the lines of its input, or the concatenation of all files listed in its argument list, in sorted order. Sorting is done based on one or more sort keys extracted from each line of input. By default, the entire input line is taken as the sort key. Blank space is the default field separator. The command supports a number of command-line options that can vary by implementation. For instance, the "-r" flag will reverse the sort order.
History
A sort command that invokes a general sort facility was first implemented within Multics.[1] Later, it appeared in Version 1 Unix. This version was originally written by Ken Thompson at AT&T Bell Laboratories. By Version 4 Thompson had modified it to use pipes, but sort retained an option to name the output file because it was used to sort a file in place. In Version 5, Thompson invented "-" to represent standard input.[2]
The version of sort bundled in GNU coreutils was written by Mike Haertel and Paul Eggert.[3] This implementation employs the merge sort algorithm.
Similar commands are available on many other operating systems; for example, a sort command is part of ASCII's MSX-DOS2 Tools for MSX-DOS version 2.[4]
The sort command has also been ported to the IBM i operating system.[5]
Syntax
sort [OPTION]... [FILE]...
With no FILE, or when FILE is -, the command reads from standard input.
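As a small sketch of the two input modes described above, the snippet below first sorts lines arriving on standard input, then sorts a file together with standard input by passing - as a second operand. The file name names.txt and the scratch directory are illustrative assumptions, and LC_ALL=C is set only so the byte-wise ordering is predictable across systems.

```shell
#!/bin/sh
# Scratch directory and file name are illustrative assumptions.
tmp=$(mktemp -d)
printf 'pear\napple\n' > "$tmp/names.txt"

# No FILE operand: sort reads standard input.
from_stdin=$(printf 'cherry\nbanana\n' | LC_ALL=C sort)

# "-" as an operand stands for standard input, so the file contents and
# the piped lines are sorted together as one input.
combined=$(printf 'cherry\nbanana\n' | LC_ALL=C sort "$tmp/names.txt" -)

printf '%s\n\n%s\n' "$from_stdin" "$combined"
rm -rf "$tmp"
```

The first result holds banana and cherry; the second holds apple, banana, cherry and pear, showing that the operands are concatenated before sorting.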
Parameters
Name | Description | Unix | Plan 9 | Inferno | FreeBSD | Linux | MSX-DOS | IBM i |
---|---|---|---|---|---|---|---|---|
-b, --ignore-leading-blanks | Ignores leading blanks. | Yes | Yes | No | Yes | Yes | No | Yes |
-c | Check that input file is sorted. | No | Yes | No | Yes | Yes | No | Yes |
-C | Like -c, but does not report the first bad line. | No | No | No | Yes | Yes | No | No |
-d, --dictionary-order | Considers only blanks and alphanumeric characters. | Yes | Yes | No | Yes | Yes | No | Yes |
-f, --ignore-case | Fold lower case to upper case characters. | Yes | Yes | No | Yes | Yes | No | Yes |
-g, --general-numeric-sort, --sort=general-numeric | Compares according to general numerical value. | Yes | Yes | No | Yes | Yes | No | No |
-h, --human-numeric-sort, --sort=human-numeric | Compare human readable numbers (e.g., 2K 1G). | Yes | No | No | Yes | Yes | No | No |
-i, --ignore-nonprinting | Considers only printable characters. | Yes | Yes | No | Yes | Yes | No | Yes |
-k, --key=POS1[,POS2] | Start a key at POS1 (origin 1), end it at POS2 (default end of line). | No | No | No | Yes | Yes | No | No |
-m | Merge only; input files are assumed to be presorted. | No | Yes | No | Yes | Yes | No | Yes |
-M, --month-sort, --sort=month | Compares (unknown) < 'JAN' < ... < 'DEC'. | Yes | Yes | No | Yes | Yes | No | No |
-n, --numeric-sort, --sort=numeric | Compares according to string numerical value. | Yes | Yes | Yes | Yes | Yes | No | Yes |
-o OUTPUT | Uses OUTPUT file instead of standard output. | No | Yes | No | Yes | Yes | No | Yes |
-r, --reverse | Reverses the result of comparisons. | Yes | Yes | Yes | Yes | Yes | No | Yes |
-R, --random-sort, --sort=random | Shuffles, but groups identical keys. See also: shuf | Yes | No | No | Yes | Yes | No | No |
-s | Stabilizes sort by disabling last-resort comparison. | No | No | No | Yes | Yes | No | No |
-S size, --buffer-size=size | Use size for the maximum size of the memory buffer. | No | No | No | Yes | No | No | No |
-tx | 'Tab character' separating fields is x. | No | Yes | No | No | Yes | No | Yes |
-t char, --field-separator=char | Uses char instead of non-blank to blank transition. | No | No | No | Yes | Yes | No | No |
-T dir, --temporary-directory=dir | Uses dir for temporaries. | No | Yes | No | Yes | Yes | No | No |
-u, --unique | Unique processing to suppress all but one in each set of lines having equal keys. | No | Yes | No | Yes | Yes | No | Yes |
-V, --version-sort | Natural sort of (version) numbers within text. | No | No | No | Yes | Yes | No | No |
-w | Like -i, but ignore only tabs and spaces. | No | Yes | No | No | No | No | No |
-z, --zero-terminated | End lines with 0 byte, not newline. | No | No | No | Yes | Yes | No | No |
--help | Display help and exit. | No | No | No | Yes | Yes | No | No |
--version | Output version information and exit. | No | No | No | Yes | Yes | No | No |
/R | Reverses the result of comparisons. | No | No | No | No | No | Yes | No |
/S | Specify the number of digits to determine how many digits of each line should be judged. | No | No | No | No | No | Yes | No |
/A | Sort by ASCII code. | No | No | No | No | No | Yes | No |
/H | Include hidden files when using wild cards. | No | No | No | No | No | Yes | No |
Examples
Sort a file in alphabetical order
$ cat phonebook
Smith, Brett 555-4321
Doe, John 555-1234
Doe, Jane 555-3214
Avery, Cory 555-4132
Fogarty, Suzie 555-2314
$ sort phonebook
Avery, Cory 555-4132
Doe, Jane 555-3214
Doe, John 555-1234
Fogarty, Suzie 555-2314
Smith, Brett 555-4321
Sort by number
The -n option makes the program sort according to numerical value. The du command produces output that starts with a number, the file size, so its output can be piped to sort to produce a list of files sorted by (ascending) file size:
$ du /bin/* | sort -n
4 /bin/domainname
24 /bin/ls
102 /bin/sh
304 /bin/csh
The find command with the -ls option prints file sizes in the 7th field, so a list of the LaTeX files sorted by file size is produced by:
$ find . -name "*.tex" -ls | sort -k 7n
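To make the effect of -n concrete, the sketch below sorts the four sizes from the du listing twice: once with the default character-by-character comparison and once numerically. The variable names are illustrative, and LC_ALL=C is an assumption added only to keep the default ordering predictable.

```shell
#!/bin/sh
# The four sizes from the du listing, in arbitrary order.
sizes=$(printf '304\n4\n102\n24\n')

# Default comparison is character by character, so "102" sorts before "24".
lexical=$(printf '%s\n' "$sizes" | LC_ALL=C sort)

# -n compares the leading number by value.
numeric=$(printf '%s\n' "$sizes" | LC_ALL=C sort -n)

printf 'lexical: %s\n' "$(printf '%s' "$lexical" | tr '\n' ' ')"
printf 'numeric: %s\n' "$(printf '%s' "$numeric" | tr '\n' ' ')"
```

Without -n the order is 102, 24, 304, 4; with -n it is 4, 24, 102, 304, which is why size listings are normally piped through sort -n.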
Columns or fields
Use the -k option to sort on a certain column. For example, use "-k 2" to sort on the second column. In old versions of sort, the +1 option made the program sort on the second column of data (+2 for the third, etc.). This usage is deprecated.
$ cat zipcode
Adam 12345
Bob 34567
Joe 56789
Sam 45678
Wendy 23456
$ sort -k 2n zipcode
Adam 12345
Wendy 23456
Bob 34567
Sam 45678
Joe 56789
Sort on multiple fields
The -k m,n option lets you sort on a key that is potentially composed of multiple fields (start at column m, end at column n):
$ cat quota
fred 2000
bob 1000
an 1000
chad 1000
don 1500
eric 500
$ sort -k2,2n -k1,1 quota
eric 500
an 1000
bob 1000
chad 1000
don 1500
fred 2000
Here the first sort is done using column 2. -k2,2n specifies sorting on the key starting and ending with column 2, and sorting numerically. If -k2 is used instead, the sort key would begin at column 2 and extend to the end of the line, spanning all the fields in between. -k1,1 dictates breaking ties using the value in column 1, sorting alphabetically by default. Note that an, bob, and chad have the same quota and are sorted alphabetically in the final output.
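The tie-breaking behaviour can be checked with a short sketch that rebuilds the quota file and runs the same primary key with two different secondary keys. The scratch directory is an illustrative assumption; the second command uses the per-key r modifier, which reverses only the key it is attached to.

```shell
#!/bin/sh
# Scratch directory is an illustrative assumption; records match the quota file.
tmp=$(mktemp -d)
printf '%s\n' 'fred 2000' 'bob 1000' 'an 1000' 'chad 1000' 'don 1500' 'eric 500' > "$tmp/quota"

# Primary key: field 2 only, numeric. Tie-break: field 1 only, ascending.
ascending_ties=$(LC_ALL=C sort -k2,2n -k1,1 "$tmp/quota")

# Same primary key, but the tie-break key carries its own "r" modifier,
# so only the names within each quota group come out in reverse order.
descending_ties=$(LC_ALL=C sort -k2,2n -k1,1r "$tmp/quota")

printf '%s\n\n%s\n' "$ascending_ties" "$descending_ties"
rm -rf "$tmp"
```

In the second listing the quotas still ascend from 500 to 2000, but the three users with quota 1000 appear as chad, bob, an, which shows that each -k key carries its own ordering options.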
Sorting a pipe delimited file
$ sort -k2,2 -k1,1 -t'|' zipcode
Adam|12345
Wendy|23456
Bob|34567
Sam|45678
Joe|56789
Sorting a tab delimited file
Sorting a file with tab separated values requires a tab character to be specified as the column delimiter. This illustration uses the shell's dollar-quote notation[6][7] to specify the tab as a C escape sequence.
$ sort -k2,2 -t $'\t' phonebook
Doe, John 555-1234
Fogarty, Suzie 555-2314
Doe, Jane 555-3214
Avery, Cory 555-4132
Smith, Brett 555-4321
Sort in reverse
The -r option just reverses the order of the sort:
$ sort -rk 2n zipcode
Joe 56789
Sam 45678
Bob 34567
Wendy 23456
Adam 12345
Sort in random
The GNU implementation has a -R (--random-sort) option based on hashing; this is not a full random shuffle because it will sort identical lines together. A true random shuffle is provided by the Unix utility shuf.
Sort by version
The GNU implementation has a -V (--version-sort) option, which is a natural sort of (version) numbers within text. Two text strings that are to be compared are split into blocks of letters and blocks of digits. Blocks of letters are compared alpha-numerically, and blocks of digits are compared numerically (i.e., skipping leading zeros, more digits means larger, otherwise the leftmost digits that differ determine the result). Blocks are compared left-to-right and the first non-equal block in that loop decides which text is larger. This happens to work for IP addresses, Debian package version strings and similar tasks where numbers of variable length are embedded in strings.
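A minimal sketch of the difference, assuming GNU sort is installed (the -V option is not available in every implementation): three hypothetical file names are sorted with the default comparison and then with -V. LC_ALL=C is again an assumption used to pin down the default ordering.

```shell
#!/bin/sh
# Hypothetical file names chosen for illustration.
names=$(printf '%s\n' 'file10.txt' 'file2.txt' 'file1.txt')

# Plain comparison goes character by character: "file10" sorts before "file2".
plain=$(printf '%s\n' "$names" | LC_ALL=C sort)

# -V (GNU sort) compares the digit blocks by value: 1 < 2 < 10.
by_version=$(printf '%s\n' "$names" | LC_ALL=C sort -V)

printf '%s\n\n%s\n' "$plain" "$by_version"
```

The plain sort lists file1.txt, file10.txt, file2.txt, while the version sort lists file1.txt, file2.txt, file10.txt, because the digit blocks 1, 2 and 10 are compared as numbers.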
See also
- Collation
- List of Unix commands
- uniq
- shuf
References
- ^ "Multics Commands". www.multicians.org.
- ^ McIlroy, M. D. (1987). A Research Unix reader: annotated excerpts from the Programmer's Manual, 1971–1986 (PDF) (Technical report). CSTR. Bell Labs. 139.
- ^ "sort(1): sort lines of text files - Linux man page". linux.die.net.
- ^ "MSX-DOS2 Tools User's Manual - MSX-DOS2 TOOLS ユーザーズマニュアル". April 1, 1993 – via Internet Archive.
- ^ IBM. "IBM System i Version 7.2 Programming Qshell" (PDF). Retrieved 2020-09-05.
- ^ "The GNU Bash Reference Manual, for Bash, Version 4.2: Section 3.1.2.4 ANSI-C Quoting". Free Software Foundation, Inc. 28 December 2010. Retrieved 1 February 2013.
Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard.
- ^ Fowler, Glenn S.; Korn, David G.; Vo, Kiem-Phong. "KornShell FAQ". Archived from the original on 2013-05-27. Retrieved 3 March 2015.
The $'...' string literal syntax was added to ksh93 to solve the problem of entering special characters in scripts. It uses ANSI-C rules to interpret the string between the '...'.
Further reading
- Shotts (Jr), William E. (2012). The Linux Command Line: A Complete Introduction. No Starch Press. ISBN 978-1593273897.
- McElhearn, Kirk (2006). The Mac OS X Command Line: Unix Under the Hood. John Wiley & Sons. ISBN 978-0470113851.
External links
- Original Sort manpage: the original BSD Unix program's manpage
- Linux User Manual – User Commands
- Plan 9 Programmer's Manual, Volume 1
- Inferno General commands Manual
- Further details about sort at Softpanorama
Source: https://en.wikipedia.org/wiki/Sort_%28Unix%29