------------------------------------------------------------------------
iconv
------------------------------------------------------------------------
The original character coding, ASCII (values 0 to 127), was developed
for teletype (and traces back to the telegraph). A single byte was
sufficient (with the upper half unused). IBM developed EBCDIC (also 1
byte), which does not include ASCII as a subset. Since then there has
been a need to encode other languages. All this has led to a large
number of character codings.

"iconv" allows you to map one coding to another. The coding schemes
that iconv can handle are listed by
$ iconv -l	#you should be impressed
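
The listing is long, so it helps to count it or search it. A small
sketch; the exact layout of the listing differs between iconv
implementations (some print several aliases per line), so the count is
only a rough figure:

```shell
# Count the lines of the listing (a line may hold several aliases,
# so this is not an exact count of codings).
iconv -l | wc -l

# Search the listing for one family of names, ignoring case.
utf_names=$(iconv -l | grep -i 'utf')
echo "$utf_names"
```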

The general form of usage is
$ iconv -f old-encoding -t new-encoding Infile > Outfile
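
As an illustration of the general form, here is a minimal sketch that
converts a one-line file from ISO-8859-1 to UTF-8. The file names and
the sample text are made up for the example, and it assumes your iconv
knows both coding names (check with iconv -l):

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmpdir=$(mktemp -d)

# "café" in ISO-8859-1: the é is the single byte 0xE9 (octal 351).
printf 'caf\351\n' > "$tmpdir/latin1.txt"

# Convert from ISO-8859-1 to UTF-8.
iconv -f ISO-8859-1 -t UTF-8 "$tmpdir/latin1.txt" > "$tmpdir/utf8.txt"

# In UTF-8 the é takes two bytes (0xC3 0xA9), so the file grows by one byte.
latin1_bytes=$(( $(wc -c < "$tmpdir/latin1.txt") ))
utf8_bytes=$(( $(wc -c < "$tmpdir/utf8.txt") ))
echo "latin1: $latin1_bytes bytes, utf8: $utf8_bytes bytes"   # prints: latin1: 5 bytes, utf8: 6 bytes

rm -rf "$tmpdir"
```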

The only use I have made of iconv is to get rid of all characters
other than plain ASCII (the lower half of a byte).

$ iconv -t ASCII -c Infile > Outfile
-t ASCII ... the target coding is standard ASCII
-c ... characters which cannot be converted are silently discarded
(with no -f, the input is assumed to be in the coding of the current locale)
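
A quick way to see -c at work is to feed iconv a short string through a
pipe. This sketch assumes the input is UTF-8 and says so with -f; the
sample text is made up. Some iconv builds return a non-zero exit status
when -c drops something, so the status is ignored here:

```shell
# "café au lait" in UTF-8; the é is the two bytes 0xC3 0xA9 (octal 303 251).
ascii_only=$(printf 'caf\303\251 au lait\n' | iconv -f UTF-8 -t ASCII -c || true)

# The é has no ASCII equivalent, so -c drops it.
echo "$ascii_only"   # prints: caf au lait
```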

There are other solutions to this task
$ LC_ALL=C  tr -dc '\0-\177' < input_file >  output_file

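Note: there is no ";" after LC_ALL=C, even though you may be tempted to
add one. Written as a prefix, the assignment applies only to that one
tr command and does not change the setting of your shell.
To learn more about the "locale" settings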
$ locale
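
The tr filter and the prefix form of the assignment can both be tried
on a short string. A minimal sketch; the sample text and the variable
name DEMO_VAR are made up for the illustration:

```shell
# Same tr filter as above, fed from a pipe; the input is "café" in UTF-8.
filtered=$(printf 'caf\303\251\n' | LC_ALL=C tr -dc '\0-\177')
echo "$filtered"                          # prints: caf

# A prefix assignment is visible to that one command only.
# DEMO_VAR is a hypothetical variable used just for this demonstration.
unset DEMO_VAR
inside=$(DEMO_VAR=hello sh -c 'echo "$DEMO_VAR"')
echo "inside the command: $inside"        # prints: inside the command: hello
echo "afterwards: ${DEMO_VAR:-unset}"     # prints: afterwards: unset
```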


BACKGROUND:
The first character coding was ASCII (American Standard Code
for Information Interchange), which traces its roots back to the days
of telegraphy. It had only 128 characters, mapped to [0,127].

$ man ascii

The first 32 characters (0-31) and the last character (127) are
"control" characters and trace their heritage to the days of
telegraphy (and teletype machines). Of these the following are
still in use:

0 (NUL,\0,^@) marks the end of a string
7 (BEL, \a, ^G) bell
8 (BS, \b, ^H) backspace
9 (HT, \t, ^I) Horizontal tab
10 (LF, \n, ^J) Linefeed
11 (VT, \v, ^K) vertical tab
12 (FF, \f, ^L) form feed (eject paper, clear screen)
13 (CR, \r, ^M) carriage return 
26 (SUB, Control-Z, ^Z) used as an end-of-file marker on DOS/Windows
27 (escape, ESC, ^[)
127 (delete, DEL, ^?)

Characters with values between 32 and 126 are the printable characters
(32 is the space).
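
Control characters are invisible in normal output, but "cat -v" shows
them in the same ^X notation used in the list above. The sketch below
assumes your cat supports -v (both the GNU and BSD versions do); the
sample string is made up. It also shows where the notation comes from:
the control code is the code of the letter minus 64.

```shell
# Bytes 7 (BEL), 13 (CR), 27 (ESC) and 127 (DEL), written as octal escapes.
visible=$(printf 'a\007b\015c\033d\177\n' | cat -v)
echo "$visible"                      # prints: a^Gb^Mc^[d^?

# Caret notation: "M" has code 77, and 77 - 64 = 13, which is CR.
code_of_M=$(printf '%d' "'M")
cr_code=$(( code_of_M - 64 ))
echo "$cr_code"                      # prints: 13
```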