
CS 341 : Programming Languages

Fall 2023

Project 02 : Image processing in F# (doc v1.0)

Assignment: F# function library to perform image operations

Evaluation: Gradescope followed by manual execution & review

Policy: Individual work only

Complete By: Saturday, October 28th @ 11:59pm CDT

Late submissions: Not Allowed

Background

You are going to write a program to perform various operations on images stored in PPM format, such as this lovely image of a piece of cake:

There are many image formats you are no doubt familiar with: JPG, PNG, etc. The advantage of PPM is that the file format is human-readable, so you can open PPM files in a text editor. This makes it easier to write programs that manipulate the images, and it also makes it easier to debug your output — you can simply open the image file in a text editor and “see” what’s wrong. First some background on PPM files, and then the details of the assignment…

PPM Image Format

The PPM (or Portable Pix Map) image format is encoded in human-readable ASCII text. For those of you who enjoy reading documentation, the formal image specification can be found here [1]. Here is a sample ppm file, representing a very small 4x4 image:

[1] http://netpbm.sourceforge.net/doc/ppm.html

P3
4 4
255
0 0 0 100 0 0 0 0 0 255 0 255
0 0 0 0 255 175 0 0 0 0 0 0
0 0 0 0 0 0 0 15 175 0 0 0
255 0 255 0 0 0 0 0 0 255 255 255


Here is what this image looks like, magnified 5,000%. Notice it consists of 16 pixels, laid out in 4 rows with 4 pixels per row:

You can think of an image as having two parts, a header and a body. The header consists of information about the image such as width and height, GPS location, time, date, etc. For PPM images, the header is very simple, and has only 4 entries:

P3
4 4
255


P3 is a "magic number". It indicates what type of PPM image this is (full color, ASCII encoding). For this assignment it will always be the string “P3”. The next two values, 4 4, represent the width and height of the image — or more accurately from a programming perspective, the number of pixels in one row is the width and the number of rows in the image is the height. The final value, 255, is the maximum color depth for the image. For images in “P3” format, the depth can be any value in the range 0..255, inclusive.

The image body contains the pixel data — i.e. the color of each pixel in the image. For the image shown above, which is a 4x4 image, we have 4 rows of pixel data:

0 0 0 100 0 0 0 0 0 255 0 255
0 0 0 0 255 175 0 0 0 0 0 0
0 0 0 0 0 0 0 15 175 0 0 0
255 0 255 0 0 0 0 0 0 255 255 255

Look at this data closely… First, notice the values range from 0 .. maximum color depth (in this case 255). Second, notice that each row contains exactly 12 values, with at least one space between each value. Why 12? Because each row contains 4 pixels, but each pixel in PPM format consists of 3 values: the amount of RED, the amount of GREEN, and the amount of BLUE. This is more commonly known as the pixel’s RGB value. Black, the absence of color, has an RGB value of 0 0 0 — the minimum amount of each color. White, the presence of all colors, has an RGB value of depth depth depth — the maximum amount of each color. As shown above, notice the first pixel in the 4x4 image is black, and the last pixel is white.

In general a pixel’s RGB value is the mix of red, green, and blue needed to make that color. For example, here are some common RGB values, assuming a maximum color depth of 255:

Yellow: 255 255 0
Maroon: 128 0 0
Navy Blue: 0 0 128
Purple: 128 0 128

You can read more about RGB on the web [2]. We will provide you with 5 PPM images to work with. The image shown above is “tiny4by4.ppm”. The images will be made available in the replit repository “Project 02” for this project.

● blocks.ppm
● cake.ppm
● square.ppm
● tiny4by4.ppm
● tinyred4by4.ppm

Viewing PPM Images

PPM files are an uncommon file format, so if you double-click on a “.ppm” image file you will be unable to view it. Here are some options for viewing PPM files, which you’ll need to do for testing purposes…

Option #1 is to download a simple JavaScript program for viewing images on your local computer: ppmReader.zip. Download, double-click to open, and extract the file ppmReader.html --- save this anywhere on your local computer. When you want to view a PPM image file, download the PPM file to your local computer, double-click on ppmReader.html, and select the PPM image file for viewing. This tool is also available online.

Option #2: on Windows you can view PPM images using Irfanview [3], a free image utility. If the installation fails, note that I had to download the installer and run as administrator for it to install properly on Windows: right-click on the installer program and select “run as administrator”.

Option #3: you can “view” PPM files in your local text editor: File menu, Open command, and then browse to the file and open it — you will see lots of integers :-) If you don’t see the PPM files listed in the Open File dialog, try selecting “All files” from the drop-down.

[2] http://www.rapidtables.com/web/color/RGB_Color.htm
[3] http://www.irfanview.com/

Assignment

A replit.com project “Project 02” has been created with an initial program template (written in C# and F#), along with PPM image files for testing. The provided program implements a simple console-based interface for performing one of five image processing functions:

In the screenshot shown above, the user has entered the name of the cake PPM image file “cake.ppm”, and has selected operation #1 to perform a grayscale conversion. The resulting image is written to the file “cakegrayscale.ppm”.

The console-based interface is written in C#, and can be found in the file “Program.cs”; there is no reason to modify this file (except for debugging purposes). Your assignment is to implement the five image processing functions found in the F# file “Library.fs”: Grayscale, Threshold, FlipHorizontal, EdgeDetect, and RotateRight90. Here are the contents of that file. Do not change the API in any way: do not add parameters, do not change their types, etc. We will be grading your F# library against our own test suite, and so your API must match what is given below:

// Converts the image into grayscale and returns the
// resulting image as a list of lists. Pixels in grayscale
// have the same value for each of the Red, Green and Blue
// values in the RGB value. Conversion to grayscale is done
// by using a WEIGHTED AVERAGE calculation. A normal average
// (adding the three values and dividing by 3) is NOT the best,
// since the human eye does not perceive the brightness of
// red, green and blue the same. The human eye perceives
// green as brighter than red and it perceives red as brighter
// than blue. Research has shown that the following weighted
// values should be used when calculating grayscale.
// - the green value should account for 58.7% of the grayscale amount.
// - the red value should account for 29.9% of the grayscale amount.
// - the blue value should account for 11.4% of the grayscale amount.
//
// So if the RGB values were (25, 75, 250), the grayscale amount
// would be 80, (25 * 0.299 + 75 * 0.587 + 250 * 0.114 => 80)
// and then all three RGB values would become 80 or (80, 80, 80).
// We will use truncation to cast from the floating point result
// to the integer grayscale value.
//
// Returns: updated image.
//
let rec Grayscale (width:int)
                  (height:int)
                  (depth:int)
                  (image:(int*int*int) list list) =
  // for now, just return the image back, i.e. do nothing:
  image

//
// Threshold
//
// Thresholding increases image separation --- dark values
// become darker and light values become lighter. Given a
// threshold value in the range 0 < threshold < color depth,
// each RGB value is compared to see if it's > threshold.
// If so, that RGB value is replaced by the color depth;
// if not, that RGB value is replaced with 0.
//
// Example: if threshold is 100 and depth is 255, then given
// a pixel (80, 120, 160), the new pixel is (0, 255, 255).
//
// Returns: updated image.
//
let rec Threshold (width:int)
                  (height:int)
                  (depth:int)
                  (image:(int*int*int) list list)
                  (threshold:int) =
  // for now, just return the image back, i.e. do nothing:
  image

//
// FlipHorizontal:
//
// Flips an image so that what’s on the left is now on
// the right, and what’s on the right is now on the left.
// That is, the pixel that is on the far left end of the
// row ends up on the far right of the row, and the pixel
// on the far right ends up on the far left. This is
// repeated as you move inwards toward the row's center.
//
// Returns: updated image.
//
let rec FlipHorizontal (width:int)
                       (height:int)
                       (depth:int)
                       (image:(int*int*int) list list) =
  // for now, just return the image back, i.e. do nothing:
  image

//
// Edge Detection:
//
// Edge detection is an algorithm used in computer vision to help
// distinguish different objects in a picture or to distinguish an
// object in the foreground of the picture from the background.
//
// Edge Detection replaces each pixel in the original image with
// a black pixel, (0, 0, 0), if the original pixel contains an
// "edge" in the original image. If the original pixel does not
// contain an edge, the pixel is replaced with a white pixel
// (255, 255, 255).
//
// An edge occurs when the color of a pixel is "significantly different"
// when compared to the color of two of its neighboring pixels.
// We only compare each pixel in the image with the
// pixel immediately to the right of it and with the pixel
// immediately below it. If either pixel has a color difference
// greater than a given threshold, then it is "significantly
// different" and an edge occurs. Note that the right-most column
// of pixels and the bottom-most row of pixels cannot perform
// this calculation, so the final image contains one less column
// and one less row than the original image.
//
// To calculate the "color difference" between two pixels, we
// treat each pixel as a point on a 3-dimensional grid and
// we calculate the distance between the two points using the
// 3-dimensional extension to the Pythagorean Theorem.
// Distance between (x1, y1, z1) and (x2, y2, z2) is
// sqrt ( (x1-x2)^2 + (y1-y2)^2 + (z1-z2)^2 )
//
// The threshold amount will need to be given, which is an
// integer 0 < threshold < 255. If the color distance between
// the original pixel and either of the two neighboring pixels
// is greater than the threshold amount, an edge occurs and
// a black pixel is put in the resulting image at the location
// of the original pixel.
//
// Returns: updated image.
//
let rec EdgeDetect (width:int)
                   (height:int)
                   (depth:int)
                   (image:(int*int*int) list list)
                   (threshold:int) =
  // for now, just return the image back, i.e. do nothing:
  image

//
// RotateRight90:
//
// Rotates the image to the right 90 degrees.
//
// Returns: updated image.
//
let rec RotateRight90 (width:int)
                      (height:int)
                      (depth:int)
                      (image:(int*int*int) list list) =
  // for now, just return the image back, i.e. do nothing:
  image

Assignment details

The parameters to the F# functions should be self-explanatory. For example, the RotateRight90 function takes the image width, height, depth, and the image data. The image data is the interesting one… The format is a list of lists of tuples, where each tuple denotes the RGB values for that pixel. For example, earlier we presented the file format for the “tiny4by4.ppm” image:

P3
4 4
255
0 0 0 100 0 0 0 0 0 255 0 255
0 0 0 0 255 175 0 0 0 0 0 0
0 0 0 0 0 0 0 15 175 0 0 0
255 0 255 0 0 0 0 0 0 255 255 255

The image data passed to the F# image functions in this case is a list of 4 lists, one sub-list per row. Each row is a sub-list of tuples (R, G, B), where R, G, and B are color values 0..depth (in this case 0..255):

[ [ (0,0,0); (100,0,0); (0,0,0); (255,0,255) ] ;
  [ (0,0,0); (0,255,175); (0,0,0); (0,0,0) ] ;
  [ (0,0,0); (0,0,0); (0,15,175); (0,0,0) ] ;
  [ (255,0,255); (0,0,0); (0,0,0); (255,255,255) ] ]

Note each sub-list contains the same number of pixels ― 4 in this case. You must work with this format for communication between the console-based front-end and the F# back-end; you cannot change the data structure.
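
As a quick illustration of working with this structure (a sketch only; the helper name mapPixels is made up here and is not part of the template), a function can be applied to every pixel with nested List.map calls and tuple pattern matching:

// Sketch only: applies a function f to every (R, G, B) tuple in the
// list-of-lists image, preserving the row structure.
let mapPixels (f:(int*int*int) -> (int*int*int)) (image:(int*int*int) list list) =
  image |> List.map (List.map f)

// For example, mapPixels (fun (r, g, b) -> (b, g, r)) image
// swaps the red and blue channel of every pixel and keeps the
// rows-of-pixels shape intact.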

Here is more information about the functions you need to write. Use whatever features you want in F#, except functions that perform direct image manipulations. Strive for efficient solutions using the techniques we have discussed in class; any function that takes over a minute to execute is considered unacceptable and thus wrong. And of course, do *not* use imperative style programming: no mutable variables, no arrays, and no loops.

1. Grayscale (width:int) (height:int) (depth:int) (image:(int*int*int) list list):

This function converts the image into grayscale and returns the resulting image as a list of lists. Conversion to grayscale is done by calculating a weighted average of the RGB values for a pixel, and then replacing them all by that average. The weights used for this calculation are 29.9% for red, 58.7% for green and 11.4% for blue. This is because the human eye perceives green as brighter than red and perceives red as brighter than blue. So if the RGB values were (25, 75, 250), the average would be 80, since (25 * 0.299) + (75 * 0.587) + (250 * 0.114) is 80. Then all three RGB values would become 80 — i.e. (80, 80, 80). Here’s the cake in gray:
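
For illustration, here is one way the weighted average described above could be expressed (a sketch only, not necessarily the solution you must submit; the name grayscaleSketch is made up here). Note that F#'s int conversion truncates the floating-point result, as required:

// Sketch: weighted-average grayscale over a list-of-lists image.
let grayscaleSketch (image:(int*int*int) list list) =
  image
  |> List.map (List.map (fun (r, g, b) ->
       // 29.9% red + 58.7% green + 11.4% blue, truncated to an int
       let gray = int (0.299 * float r + 0.587 * float g + 0.114 * float b)
       (gray, gray, gray)))

// e.g. a pixel (25, 75, 250) becomes (80, 80, 80).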

2. Threshold (width:int) (height:int) (depth:int) (image:(int*int*int) list list) (threshold:int):

Thresholding increases image separation --- dark values become darker and light values become lighter. Given a threshold value in the range 0 < threshold < color depth, any RGB value > threshold is set to the color depth (e.g. 255), while any RGB value <= threshold is set to 0. Example: assuming a threshold of 100 and a depth of 255, the pixel (80, 120, 160) is transformed to the pixel (0, 255, 255). Given a grayscale image, after thresholding the image becomes black and white. Another example: given the cake image, the left is the result with a threshold value of 50, and the right with a threshold value of 200:
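
As a sketch of the same idea (illustrative only; the name thresholdSketch is made up here), each color value can be tested independently against the threshold:

// Sketch: replace each RGB value with depth if it exceeds the
// threshold, and with 0 otherwise.
let thresholdSketch (depth:int) (threshold:int) (image:(int*int*int) list list) =
  let clip v = if v > threshold then depth else 0
  image |> List.map (List.map (fun (r, g, b) -> (clip r, clip g, clip b)))

// e.g. with depth 255 and threshold 100, (80, 120, 160) becomes (0, 255, 255).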


3. FlipHorizontal (width:int) (height:int) (depth:int) (image:(int*int*int) list list):

This function flips an image so that what’s on the left is now on the right, and what’s on the right is now on the left. That is, the pixel that is on the far left end of the row ends up on the far right of the row, and the pixel on the far right ends up on the far left. This is repeated as you move inwards toward the center of the row; remember to preserve RGB order for each pixel as you flip ― you flip pixels, not individual RGB colors. Here’s the cake flipped horizontally:
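
Since each row is its own list, a horizontal flip amounts to reversing every row while leaving the (R, G, B) tuples untouched. A minimal sketch (not necessarily the expected solution; the name flipHorizontalSketch is made up here):

// Sketch: reverse each row; the pixels themselves are not modified.
let flipHorizontalSketch (image:(int*int*int) list list) =
  image |> List.map List.rev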

4. EdgeDetect (width:int) (height:int) (depth:int) (image:(int*int*int) list list) (threshold:int):

Edge detection is an algorithm used in computer vision to help distinguish different objects in a picture or to distinguish an object in the foreground of the picture from the background. Edge Detection replaces each pixel in the original image with a black pixel, (0, 0, 0), if the original pixel contains an "edge" in the original image. If the original pixel does not contain an edge, the pixel is replaced with a white pixel (255, 255, 255).

An edge occurs when the color of a pixel is "significantly different" when compared to the color of two of its neighboring pixels. We only compare each pixel in the image with the pixel immediately to the right of it and with the pixel immediately below it. If either pixel has a color difference greater than a given threshold, then it is "significantly different" and an edge occurs. Note that the right-most column of pixels and the bottom-most row of pixels cannot perform this calculation, so the final image contains one less column and one less row than the original image.

To calculate the "color difference" between two pixels, we treat each pixel as a point on a 3-dimensional grid and we calculate the distance between the two points using the 3-dimensional extension to the Pythagorean Theorem. The distance between (x1, y1, z1) and (x2, y2, z2) is:

sqrt( (x1 - x2)^2 + (y1 - y2)^2 + (z1 - z2)^2 )

The threshold amount will need to be given, which is an integer 0 < threshold < 255. If the color distance between the original pixel and either of the two neighboring pixels is greater than the threshold amount, an edge occurs, and a black pixel is put in the resulting image at the location of the original pixel. Here is the cake with edge detection with a threshold of 50:
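
As a rough sketch of how the comparison might be organized (illustrative only; the names edgeDetectSketch and dist are made up here, and other decompositions are equally valid), each pixel can be paired with the pixel below it using List.zip and with the pixel to its right using List.pairwise, which also drops the last row and last column as described above:

// Sketch: 3-D color distance plus right/below comparison.
let edgeDetectSketch (threshold:int) (image:(int*int*int) list list) =
  // Euclidean distance between two RGB values, treated as 3-D points.
  let dist (r1, g1, b1) (r2, g2, b2) =
    let sq (d:int) = float d * float d
    sqrt (sq (r1 - r2) + sq (g1 - g2) + sq (b1 - b2))
  image
  |> List.pairwise                          // (row, row below); drops the bottom row
  |> List.map (fun (row, below) ->
       List.zip row below                   // pair each pixel with the pixel below it
       |> List.pairwise                     // ...and then with its right-hand neighbor
       |> List.map (fun ((p, b), (right, _)) ->
            if dist p right > float threshold || dist p b > float threshold
            then (0, 0, 0)                  // edge: black
            else (255, 255, 255)))          // no edge: white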

5. RotateRight90 (width:int) (height:int) (depth:int) (image:(int*int*int) list list):

This function rotates the image to the right 90 degrees. The image returned by this function will have different dimensions than the original image passed in. Here’s the cake rotated right 90 degrees:
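
One compact way to think about this (a sketch only; it assumes the built-in List.transpose is available in the project's F# core library, and the name rotateRight90Sketch is made up here): turning the columns into rows and then reversing each new row gives a 90-degree clockwise rotation, which is also why the width and height of the result are swapped relative to the input.

// Sketch: columns become rows, then each new row is reversed,
// giving a clockwise (right) 90-degree rotation.
let rotateRight90Sketch (image:(int*int*int) list list) =
  image
  |> List.transpose
  |> List.map List.rev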


Hints

● FlipHorizontal is probably the easiest and thus the first function to attempt. The built-in List.rev function that reverses a list makes this one pretty simple. So, use that one to understand working with a list of lists. No need to modify the tuple for the color values at each pixel.

● RotateRight90 also does not have you modify the tuple for the color values. A built-in function makes this one easy to complete as well. However, determining your own solution will help for some of the other functions.

● Both Grayscale and Threshold require you to access and change the color values in the tuple for each pixel.

● EdgeDetect is the hardest one. The resulting color values at each pixel rely on accessing the color values at two other pixels.

Electronic Submission to Gradescope Assignment “Project 02”

Before you submit, add a header comment in “Library.fs” with your name, NetID, date, project name, project description and more details about the library itself.

When you are ready to submit your program for grading, login to Gradescope and upload your “Library.fs” file. We will test your library functions against our own version of the console-based front-end. You have unlimited submissions, and Gradescope keeps a complete history of all submissions you have made. By default, Gradescope records the score of your last submission, but if that score is lower, you can click on “Submission history”, select an earlier score, and click “Activate” to select it. The activated submission will be the score that gets recorded, and the submission we grade. If you submit on-time and late, we’ll grade the last submission (the late one) unless you activate an earlier submission.

The grade reported by the Gradescope autograder will be a tentative one. After the due date, submissions will be manually reviewed to ensure project requirements have been met. Failure to meet a requirement --- e.g. use of mutable variables or loops --- will generally fail the submission with a project score of 0.

This far into the course, it goes without saying that style will also be a component of the final score. At the very least, we expect a header comment, function header comments (especially for custom helper functions), consistent spacing between functions, and comments wherever a complex block of code requires explanation.

Policy

Unless stated otherwise, all work submitted for grading *must* be done individually. While we encourage you to talk to your peers and learn from them, this interaction must be superficial with regards to all work submitted for grading. This means you *cannot* work in teams, you cannot work side-by-side, and you cannot submit someone else’s work (partial or complete) as your own. The University’s policy is available here: https://dos.uic.edu/conductforstudents.shtml .

In particular, note that you are guilty of academic dishonesty if you extend or receive any kind of unauthorized assistance. Absolutely no transfer of program code between students is permitted (paper or electronic), and you may not solicit code from family, friends, or online forums. Other examples of academic dishonesty include emailing your program to another student, copying-pasting code from the internet, working in a group on a homework assignment, and allowing a tutor, TA, or another individual to write an answer for you. Academic dishonesty is unacceptable, and penalties range from a letter grade drop to expulsion from the university; cases are handled via the official student conduct process described at https://dos.uic.edu/conductforstudents.shtml .

