CSCI 1130A/B Introduction to Computing Using Java 2018-2019 First Term
Department of Computer Science and Engineering
The Chinese University of Hong Kong
Assignment 4: Encoding ASCII Art
Due date: 7 November 2018 (Wed) Full mark: 100
Expected normal time spent on Tasks 1 and 2: 8-10 hours
Task 3: hours
Aims: 1. Encode and decode pictures of ASCII art using run-length encoding.
2. Practise reading/writing text files.
3. Practise the use of String and related methods.
BACKGROUND
ASCII Art
ASCII stands for American Standard Code for Information Interchange. It is a character encoding standard
for electronic communication. The character set consists of 95 printable characters together with other special
non-printable characters, making up a total of 128 characters in the standard 7-bit ASCII code.
ASCII art is a graphic design technique that uses printable ASCII characters to create pictures. It is
generally regarded as text-based visual art. ASCII art dates from around 1966. It arose mainly because early
printers often lacked graphics ability, so characters were used in place of graphic marks. Below is a
picture of a building made up of ASCII characters.
.
. |~~ .
. ___|___ .
((((())))
. (((((())))) .
|-------------|
+ I_I_I_I_I_I_I_I_I +
(() |---------------| (()
|---| ||-| |-| |-| |-|| |---|
_________|-----|_|---------------|_|-----|_________
I_I_I_I_I_I_I_I|I_I_I_I_I_I_I_I_I_I|I_I_I_I_I_I_I_|
|-------|------|-------------------|------|-------|
||-| |-|| |-| ||-| |-| |-| |-| |-|| |-| ||-| |-||
((|-------|------|-------------------|------|-------|))
()| |_| | |_| |::::: ------- :::::| |_| | |_| |()
))| |_| | |_| | |_| |_.-"-._| |_| | |_| | |_| |((
()|-------|------| |_| | | | | | |_| |------|-------|()
@@@@@@@@@@@@@@@@@|-----|_|_|_|_|-----|@@@@@@@@@@@@@@@@@
@@@@/=============\@@@@
/ \
Source: www.asciiworld.com
Run-length Encoding
Run-length encoding (RLE) is one of the simplest forms of lossless data compression. A sequence of data in
which the same data value occurs in many consecutive data elements is stored as a single pair of data value
and count instead of in its original form. It is quite effective in compressing data that contains frequent
repetitions and repeated patterns, for example, the ASCII art described above.
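The idea of storing a run as a (value, count) pair can be illustrated with a minimal sketch. The class name RunCounter and the "value x count" text form are assumptions made only for this illustration; they are not the file format required in the tasks of this assignment.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: RunCounter is a hypothetical helper, not part of the assignment.
class RunCounter {
    // Collapse each run of identical consecutive characters into "<char>x<count>".
    static List<String> runs(String data) {
        List<String> result = new ArrayList<>();
        int i = 0;
        while (i < data.length()) {
            char value = data.charAt(i);
            int j = i;
            // Advance j to the first position that no longer holds the same character.
            while (j < data.length() && data.charAt(j) == value) {
                j++;
            }
            result.add(value + "x" + (j - i));
            i = j;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(runs("@@@@-----@@")); // prints [@x4, -x5, @x2]
    }
}
```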
PROBLEM DEFINITION
In this assignment, you are required to write some classes to encode ASCII art and decode the compressed
ASCII art file.
Task 1: Encoding a text file containing a picture of ASCII art (50%)
To encode the file, you need to open and read the text file line-by-line. Then you need to find the runs of
consecutive identical characters on each single line.
Example 1:
The original line
@@@@@@@@@@@@@@@@@|-----|_|_|_|_|-----|@@@@@@@@@@@@@@@@@
The encoded line
17 @ 1 | 5 - 1 |_|_|_|_| 5 - 1 | 17 @
Syntax of the encoded line
<positive integer> <ASCII character(s)> <positive integer> <ASCII character(s)> <positive integer> <ASCII
character(s)> … repeats until the end of line.
The encoded lines are written to a new text file for output. The positive integer denotes the number of times
that the succeeding character(s) is repeated in the original line. There is a space between the positive integer
and the character pattern. The line continues until the end of the input line is reached.
If the positive integer equals 1, the length of the succeeding ASCII character pattern can be larger than
or equal to 1. If the positive integer is greater than 1, the length of the succeeding pattern should be 1 in
this task.
As spaces are used as the delimiter in the encoded file, we here use negative integers to encode spaces in the
original file. Whenever you encounter spaces in the original line, you are required to write a negative integer
in the encoded file. For N spaces, the integer written is -N.
Example 2:
The original line
          .           (((((()))))
The encoded line
-10 1 . -11 6 ( 5 )
Syntax of the encoded line
<negative integer> <positive integer> <ASCII character(s)> <negative integer> <positive integer> <ASCII
character(s)> <positive integer> <ASCII character(s)>
There are 10 spaces at the beginning of the original line. Then, a '.' appears. After that, we have 11 spaces. It
is then followed by 6 '('s and 5 ')'s.
The suggested algorithm:
1. Read a single line of text from the file containing a picture of ASCII art.
2. Process this line by first determining whether there is a space.
3. If there are some spaces, count the number of spaces and write a negative integer to the encoded
file.
4. If it is not a space, determine whether there is a repetition of a single character.
5. If repetition exists, write a positive integer to represent the number of the repeated character.
Also write the repeated character to the file.
6. If there is no repetition, write 1 and extract the character pattern up to a position where a space
is present OR repetition of a single character starts to occur.
7. Repeat Step 2 to process the remaining characters in the same line.
8. The encoding process ends when all lines in the original file are encoded in the output file.
For an original file that contains M lines of text, there must also be M lines of text in the corresponding
encoded file.
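One possible reading of the suggested algorithm for a single line is sketched below. The class name LineEncoderSketch and the method encodeLine are hypothetical; the sketch assumes that only the space character counts as a space (tabs are not handled) and that tokens are joined by exactly one space.

```java
// A sketch of the Task 1 steps for one line; names are assumptions, not a required API.
class LineEncoderSketch {
    static String encodeLine(String line) {
        StringBuilder out = new StringBuilder();
        int n = line.length();
        int i = 0;
        while (i < n) {
            char c = line.charAt(i);
            int j = i;
            // j stops at the end of the run of the character c.
            while (j < n && line.charAt(j) == c) {
                j++;
            }
            int run = j - i;
            if (out.length() > 0) {
                out.append(' ');
            }
            if (c == ' ') {
                out.append(-run);                       // N spaces are written as -N
            } else if (run > 1) {
                out.append(run).append(' ').append(c);  // repeated single character
            } else {
                // No repetition: extend the pattern until a space is met
                // or a character that starts a repetition is met.
                int k = i + 1;
                while (k < n && line.charAt(k) != ' '
                        && !(k + 1 < n && line.charAt(k + 1) == line.charAt(k))) {
                    k++;
                }
                out.append("1 ").append(line, i, k);
                j = k;
            }
            i = j;
        }
        return out.toString();
    }
}
```

Under these assumptions, the original line of Example 2 (10 spaces, a '.', 11 spaces, 6 '('s and 5 ')'s) is encoded as -10 1 . -11 6 ( 5 ), and an empty input line gives an empty encoded line, so the line count M is preserved.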
Task 2: Decoding the compressed ASCII art file (40%)
In this task, you are asked to reverse the above process. Given the run-length encoded file, you are required
to decode it to the original ASCII picture.
You should also open and read the encoded file line-by-line. For each line, you need to scan for integers and
strings in this single line so that you can write an appropriate number of the repeated characters to the
decoded ASCII picture file. Once a negative integer is read, you should convert it to the correct number of
spaces in the decoded file.
The suggested algorithm:
1. Read a line of characters from the encoded ASCII art file.
2. Scan this line for an integer.
3. If it is negative, write the corresponding number of spaces to the output line of the decoded
file.
4. If it is positive, scan for a string that follows in the input line. Write an appropriate number of
this character pattern in the decoded file.
5. Repeat Step 2 until the end of line is reached.
6. The decoding process ends when all lines in the encoded file are read.
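The decoding steps above could be sketched for a single line as follows, using java.util.Scanner to read integers and strings alternately. The names LineDecoderSketch and decodeLine are hypothetical, and the sketch assumes a well-formed encoded line (every positive integer is followed by a pattern).

```java
import java.util.Scanner;

// A sketch of the Task 2 steps for one encoded line; names are assumptions.
class LineDecoderSketch {
    static String decodeLine(String encoded) {
        StringBuilder out = new StringBuilder();
        Scanner tokens = new Scanner(encoded);
        while (tokens.hasNextInt()) {
            int count = tokens.nextInt();
            if (count < 0) {
                // A negative integer -N stands for N spaces.
                for (int k = 0; k < -count; k++) {
                    out.append(' ');
                }
            } else {
                // A positive integer is followed by the pattern to repeat.
                String pattern = tokens.next();
                for (int k = 0; k < count; k++) {
                    out.append(pattern);
                }
            }
        }
        tokens.close();
        return out.toString();
    }
}
```

Because the pattern is always read with next() right after a positive integer, a pattern that itself looks like a number (for example a run of digits in the picture) is still treated as text.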
Task 3: Increasing the efficiency of the encoded file (10%)
In Task 1, we only consider the repetition of a single character in the original line. We can actually look for
repeated string patterns to increase the efficiency in terms of storage space.
Example 3:
The original line
@@@@@@@@@@@@@@@@@|-----|_|_|_|_|-----|@@@@@@@@@@@@@@@@@
The encoded line
17 @ 1 | 5 - 4 |_ 1 | 5 - 1 | 17 @
If we make use of the pattern “|_” in the original line, the encoded line becomes shorter than its counterpart
as shown in Example 1.
Example 4:
The original line
  I_I_I_I_I_I_I_I|I_I_I_I_I_I_I_I_I_I|I_I_I_I_I_I_I_|
The encoded line
-2 7 I_ 1 I| 9 I_ 1 I| 7 I_ 1 |
Indeed, the repeated string pattern may be longer than 2 characters. The longer the pattern that you are able to
take into account, the smaller the size of the encoded file.
Please note that a space in the original file breaks the pattern, because the <ASCII character(s)> pattern
cannot contain any spaces in the encoded file. As the encoded file is written in text format so that it can be
decoded using a simple algorithm, the resulting file may have a size larger than the original one.
Example 5:
The original line
||@| |@| |@| |@| |@||
The encoded line
2 | 1 @| -1 1 |@| -1 1 |@| -1 1 |@| -1 1 |@ 2 |
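A minimal sketch of one greedy heuristic for Task 3 is given below. It is not the only way, and not necessarily the way that gives the smallest file: at each position it tries every pattern length inside the current space-free stretch, keeps the pattern whose repetitions cover the most characters, and otherwise collects the character into a pending pattern that is written with count 1. The class name AdvancedEncoderSketch and its methods are assumptions made for illustration.

```java
// A greedy sketch for Task 3; a heuristic under stated assumptions, not an optimal encoder.
class AdvancedEncoderSketch {
    static String encodeLine(String line) {
        StringBuilder out = new StringBuilder();
        StringBuilder literal = new StringBuilder(); // characters waiting to be written with count 1
        int n = line.length();
        int i = 0;
        while (i < n) {
            if (line.charAt(i) == ' ') {
                flush(out, literal);
                int j = i;
                while (j < n && line.charAt(j) == ' ') {
                    j++;
                }
                emit(out, String.valueOf(i - j));     // i - j is the negative space count
                i = j;
                continue;
            }
            int end = line.indexOf(' ', i);           // a space breaks the pattern
            if (end < 0) {
                end = n;
            }
            int bestLen = 0;
            int bestCount = 0;
            for (int len = 1; len * 2 <= end - i; len++) {
                int count = 1;
                // Count how many times the pattern line[i, i + len) repeats back to back.
                while (i + (count + 1) * len <= end
                        && line.regionMatches(i, line, i + count * len, len)) {
                    count++;
                }
                if (count >= 2 && len * count > bestLen * bestCount) {
                    bestLen = len;
                    bestCount = count;
                }
            }
            if (bestCount >= 2) {
                flush(out, literal);
                emit(out, bestCount + " " + line.substring(i, i + bestLen));
                i += bestLen * bestCount;
            } else {
                literal.append(line.charAt(i));
                i++;
            }
        }
        flush(out, literal);
        return out.toString();
    }

    // Append a token, separated from the previous one by a single space.
    private static void emit(StringBuilder out, String token) {
        if (out.length() > 0) {
            out.append(' ');
        }
        out.append(token);
    }

    // Write the pending non-repeating characters, if any, as "1 <pattern>".
    private static void flush(StringBuilder out, StringBuilder literal) {
        if (literal.length() > 0) {
            emit(out, "1 " + literal);
            literal.setLength(0);
        }
    }
}
```

On the original lines of Examples 3, 4 and 5 this sketch gives the same encoded lines as shown in those examples. It can still be improved: a short repetition inside a longer non-repeating stretch (for example "xaay") is split into three tokens, which is longer than writing the whole stretch once with count 1.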
In this task, your marks are given with reference to the size of your encoded file. The smaller the file size, the
higher is your mark. To make sure your encoding scheme is correct, you should verify the resulting file
by decoding it once with the method(s) written in Task 2.
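The verification can be automated by comparing the original file with the decoded file line by line. The following is a minimal sketch; the class name RoundTripCheck and the method firstDifference are hypothetical, and the check compares line contents only, so a missing newline after the last line is not reported.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// A sketch of a round-trip check; names are assumptions, not a required API.
class RoundTripCheck {
    // Returns the 1-based number of the first differing line, or -1 if the files match.
    static int firstDifference(Path original, Path decoded) throws IOException {
        List<String> a = Files.readAllLines(original);
        List<String> b = Files.readAllLines(decoded);
        int common = Math.min(a.size(), b.size());
        for (int i = 0; i < common; i++) {
            if (!a.get(i).equals(b.get(i))) {
                return i + 1;
            }
        }
        // All shared lines match; the files differ only if one has extra lines.
        return a.size() == b.size() ? -1 : common + 1;
    }
}
```

For example, calling firstDifference on the paths of "testcase1.txt" and the file decoded from the Task 3 output should return -1 when the encoding scheme is correct.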
For all the tasks above, you can assume that the dimension of the original ASCII art picture is smaller
than 500-by-500 characters, including newline characters.
PROCEDURE
1. Create a new project named Assignment4 in folder Assignment4. There are four source files named
Assignment4.java, RunLengthEncoder.java, RunLengthDecoder.java and RunLengthEncoderAdvanced.java
that contain the classes Assignment4, RunLengthEncoder (for task 1), RunLengthDecoder (for
task 2) and RunLengthEncoderAdvanced (for task 3). You shall define them in one package named
assignment4.
2. In the main method of class Assignment4, you are required to read a file name entered by the user on
the standard input as follows:
The original ASCII art picture file: testcase1
If the input name is “testcase1” as shown above, the program should read the ASCII art file “testcase1.txt”.
Then, your program generates the following four files:
(1) “testcase1_e.txt”: the encoding results from Task 1;
(2) “testcase1_d.txt”: the results from decoding (1);
(3) “testcase1_ae.txt”: the encoding results from Task 3;
(4) “testcase1_ad.txt”: the results from decoding (3).
In the main method, you also need to make use of the other classes defined to perform encoding and
decoding.
3. If you have completed writing the classes, try building the project (press the function key [F11] on the
keyboard). If there are errors, don’t panic. Double-click on the first error message in the Output window.
Check the error, correct it and re-compile. Feel tired? Take a rest.
4. If you have many opened projects, close others or click menu [Run] [Set Main Project].
5. You may insert println() statements in your work to inspect variables and intermediate results.
6. When you finish and there are no more errors, you are ready to try out the program by pressing the function
key [F6] on the keyboard. Then you can type the input file name in the standard input. Enjoy your work.
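For step 2 of the procedure, the file handling could be organised along the following lines. This is only a sketch: the class name FilePipelineSketch and its two methods are hypothetical, and the per-line operation is passed in as a function so that the same loop can serve both encoding and decoding. Reading the name itself can be done with a java.util.Scanner on System.in.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.function.UnaryOperator;

// A sketch of the file handling in main; names are assumptions, not a required design.
class FilePipelineSketch {
    // Derive the four output file names from the name typed by the user.
    static String[] outputNames(String base) {
        return new String[] {
            base + "_e.txt", base + "_d.txt", base + "_ae.txt", base + "_ad.txt"
        };
    }

    // Read inName line by line, apply perLine to each line, write the result to outName.
    static void transformFile(String inName, String outName, UnaryOperator<String> perLine)
            throws IOException {
        try (BufferedReader in = new BufferedReader(new FileReader(inName));
             PrintWriter out = new PrintWriter(outName)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(perLine.apply(line)); // one output line per input line
            }
        }
    }
}
```

With the input name "testcase1", outputNames returns the four file names listed in step 2, and transformFile("testcase1.txt", "testcase1_e.txt", ...) would be called with the line-encoding method of your own RunLengthEncoder class.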
SUBMISSION
1. Locate your NetBeans project folder, e.g. H:\Assignment4\.
2. ZIP the project folder Assignment4 and submit the file Assignment4.zip via our Online Assignment
Collection Box on Blackboard https://blackboard.cuhk.edu.hk
MARKING SCHEME AND NOTES
1. The submitted program should be free of any typing mistakes, compilation errors and warnings.
2. Comments/remarks, indentation and style are assessed in every programming assignment unless
specified otherwise. Variable naming, proper indentation for code blocks and adequate comments are
important. Insert your name, SID, section, date as well as a declaration statement on academic honesty in a
header comment block in the source file.
3. Test your work using different sets of inputs.
4. For Task 3, the smaller the size of the encoded file, the higher is the mark that you can get.
5. Remember to submit before 6:00 p.m. on the due date. No late submission will be accepted.
6. If you submit multiple times, ONLY the content and time-stamp of the latest one will be counted. You
may delete (i.e. take back) your attached file and re-submit. We ONLY take into account the last submission.
UNIVERSITY GUIDELINE FOR PLAGIARISM
Attention is drawn to University policy and regulations on honesty in academic work, and to the disciplinary
guidelines and procedures applicable to breaches of such policy and regulations. Details may be found at
http://www.cuhk.edu.hk/policy/academichonesty/. With each assignment, students are required to submit
a statement that they are aware of these policies, regulations, guidelines and procedures, in a header
comment block.
FACULTY OF ENGINEERING GUIDELINES TO ACADEMIC HONESTY
MUST read: https://www.erg.cuhk.edu.hk/erg/AcademicHonesty
(you may need to access via CUHK campus network/ CUHK1x/ CUHK VPN)