2020/4/20 How to Design Data: CPSC 103 201/202 Introduction to Systematic Program Design
How to Design Data
A data definition establishes the relationship between information and data:
Information in the program's domain is represented by data in the program.
Data in the program can be interpreted as information in the program's domain.
The representation of information in your program drives every other stage of design, as you'll see
here and in the data driven templates. A data definition must describe how to form (or make) data
that satisfies the data definition and also how to tell whether a data value satisfies the data definition.
It must also describe how to represent information in the program's domain as data and interpret a
data value as information.
So, for example, one data definition might say that numbers are used to represent the speed of a
ball. Another data definition might say that numbers are used to represent the height of an airplane.
So given a number like 6, we need a data definition to tell us how to interpret it; is it a speed, or
a height, or something else entirely? Without a data definition, the 6 could mean anything.
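As a small illustration of the point above, the sketch below gives two data definitions that use the same Python type. The names Speed and Height and the units in the interpretation comments are assumptions made for this example only.

```python
# Two data definitions that share the same underlying Python type.

Speed = float
# interp. the speed of a ball in metres per second (unit assumed for this sketch)

Height = float
# interp. the height of an airplane in metres (unit assumed for this sketch)

S1 = 6.0  # a ball moving at 6 m/s
H1 = 6.0  # an airplane 6 m above the ground

# Python sees no difference between the two values; only the
# interpretation comments tell a reader what each 6.0 means.
same_data = (S1 == H1)
```

The data is identical, so the interpretation comment is the only thing that separates a speed from a height.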
The first step of the recipe is to identify the inherent structure of the information.
Once that is done, a data definition consists of four elements:
1. A data type definition with type comments where Python's types are not specific enough.
2. An interpretation comment that describes the correspondence between information and
data.
3. One or more examples of the data.
4. A template for a one-argument function operating on data of this type.
WHAT IS THE INHERENT STRUCTURE OF THE INFORMATION?
One of the most important points in the course is that:
the structure of the information in the program's domain determines the kind of data definition used,
which in turn determines the structure of the templates and helps determine the function examples (expects),
and therefore the structure of much of the final program design.
The remainder of this page lists in detail different kinds of data definition that are used to represent
information with different structures. The page also shows in detail how to design a data definition of
each kind. This summary table provides a quick reference to which kind of data definition to use for
different information structures.
When the form of the information to be represented...            Use a data definition of this kind
cannot be separated into meaningful pieces (i.e., is "atomic")   Simple Atomic Data
is numbers within a certain range                                Interval
consists of a fixed number of distinct items                     Enumeration
is information in one of the other forms except for one
  special case                                                   Optional
consists of two or more types of information that naturally
  belong together (introduced in Module 4)                       Compound data
is of arbitrary (unknown) size (introduced in Module 5)          Arbitrary-Sized
SIMPLE ATOMIC DATA
Use simple atomic data when the information to be represented is itself atomic in form (i.e.
cannot be separated into meaningful pieces), such as the air temperature, or the x-coordinate of a
particle.
Temperature = float
# interp. the air temperature in degrees Celsius
T1 = 0.0
T2 = -24.5
@typecheck
def fn_for_temperature(t: Temperature) -> ...:
    return ...(t) # template based on Atomic Non-Distinct
As noted beside the template, it is formed according to the Data Driven Templates
(https://canvas.ubc.ca/courses/35980/pages/data-driven-templates) rules using the right hand column
of the atomic non-distinct rule.
Guidance on Data Examples and Function Example/Tests
One or two data examples are usually sufficient for simple atomic data.
When creating example/tests for a specific function operating on simple atomic data, at least one test
case is required. Additional tests are required if the function behaves differently depending on
the input. If the function returns bool, there needs to be at least one True and one False test case. Also be
on the lookout for cases where a number of some form is an interval in disguise. For example, given
a type definition like Countdown = int, in some functions 0 is likely to be a special case.
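The guidance above can be sketched with a function built from the Temperature template. The function is_below_freezing is a hypothetical example, not part of the course materials. The course's @typecheck decorator is left out so the sketch runs on its own, and plain assert statements stand in for the course's example/test notation.

```python
Temperature = float
# interp. the air temperature in degrees Celsius

T1 = 0.0
T2 = -24.5

def is_below_freezing(t: Temperature) -> bool:
    """
    return True if t is below 0 degrees Celsius, otherwise False
    """
    # return ...(t)  # template based on Atomic Non-Distinct
    return t < 0.0

# The function returns bool, so there is at least one True and one False case.
assert is_below_freezing(T2) == True
assert is_below_freezing(18.5) == False
# 0.0 is where the behaviour changes, so it is tested as well.
assert is_below_freezing(T1) == False
```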
INTERVALS
Use an interval when the information to be represented is numbers within a certain range. Intervals
can be closed (e.g. float[0, 5] includes 0 and 5), open (e.g. float(0.0, 5.0) excludes 0.0 and 5.0),
or half-open (e.g. float[0.0, 5.0) includes 0.0 but not 5.0, and float(0.0, 5.0] includes 5.0 but not
0.0). Intervals may also be bounded on only one end (e.g. float(>0) includes all numbers greater than
0).
Intervals often appear in Optionals, but can also appear alone, as in:
Time = int # in range[0, 86400)
# interp. seconds since midnight
MIDNIGHT = 0
ONE_AM = 3600
MIDDAY = 43200
@typecheck
def fn_for_time(t: Time) -> ...:
    return ...(t) # template based on Atomic Non-Distinct
Forming the Template
As noted beside the template, it is formed according to the Data Driven Templates rules
(https://canvas.ubc.ca/courses/35980/pages/data-driven-templates) using the right hand column of the
atomic non-distinct rule.
Guidance on Data Examples and Function Example/Tests
For data examples, provide sufficient examples to illustrate how the type represents information. The
three data examples above are probably more than is needed in this case.
When writing tests for functions operating on intervals be sure to test closed boundaries as well as at
least one representative point from within the range. As always, be sure to include enough tests to
check all other points of variance in behaviour across the interval.
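A minimal sketch of interval testing, using the Time definition from this section. The function is_morning is a hypothetical example. @typecheck is omitted so the block is self-contained, and assert stands in for the course's test notation.

```python
Time = int  # in range[0, 86400)
# interp. seconds since midnight

MIDNIGHT = 0
MIDDAY = 43200

def is_morning(t: Time) -> bool:
    """
    return True if t is before midday, otherwise False
    """
    # return ...(t)  # template based on Atomic Non-Distinct
    return t < MIDDAY

# closed lower boundary of the interval
assert is_morning(MIDNIGHT) == True
# representative point from within the range
assert is_morning(3600) == True
# points on either side of the change in behaviour
assert is_morning(MIDDAY - 1) == True
assert is_morning(MIDDAY) == False
# largest valid value (the upper boundary 86400 is open, so it is excluded)
assert is_morning(86399) == False
```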
ENUMERATIONS
Use an enumeration when the information to be represented consists of a fixed number of
distinct values, such as colours, letter grades, etc. Your data is then a "one of": one of the fixed list
of distinct values. Note that examples are considered redundant for enumerations as the data
type definition includes all possible values for the data.
from enum import Enum
Rock = Enum('Rock', ['ig', 'se', 'me'])
# interp. a rock is either igneous ('ig'), sedimentary ('se') or metamorphic ('me')
# examples are redundant for enumerations
@typecheck
# template based on one of (3 cases) and atomic distinct (3 times)
def fn_for_rock(r: Rock) -> ...:
    if r == Rock.ig:
        return ...
    elif r == Rock.se:
        return ...
    elif r == Rock.me:
        return ...
Forming the Template
As noted above the template, it is formed according to the Data Driven Templates
(https://canvas.ubc.ca/courses/35980/pages/data-driven-templates) rules using the right hand column
of the "one of" rule and repeatedly using the atomic distinct template rule.
Guidance on Data Examples and Function Example/Tests
Data examples are redundant for enumerations.
Functions operating on enumerations should have (at least) as many tests as there are cases in the
enumeration.
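The sketch below fills in the Rock template to produce a function with one test per case. The function rock_name is a hypothetical example. @typecheck is omitted so the block runs without the course library.

```python
from enum import Enum

Rock = Enum('Rock', ['ig', 'se', 'me'])
# interp. a rock is either igneous ('ig'), sedimentary ('se') or metamorphic ('me')

# template based on one of (3 cases) and atomic distinct (3 times)
def rock_name(r: Rock) -> str:
    """
    return the full name of the rock type r
    """
    if r == Rock.ig:
        return "igneous"
    elif r == Rock.se:
        return "sedimentary"
    elif r == Rock.me:
        return "metamorphic"

# one test for each case in the enumeration
assert rock_name(Rock.ig) == "igneous"
assert rock_name(Rock.se) == "sedimentary"
assert rock_name(Rock.me) == "metamorphic"
```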
OPTIONALS
Use an optional when your information is well-represented by another form of data (often simple
atomic or interval) except for one special case, such as when data is missing. Your data is then a
"one of": one of the special value None or the normal form. The template is similar to that for
enumerations except that there are only two subclasses—the special case and the normal data—and
therefore we use a single if/else, with the normal data handled in the else clause. If the normal
data has its own data definition, your template will include a call to a helper template in the else
clause.
Notice that where Python's int type isn't specific enough for our interval, we use a comment on that
line to give more details.
from typing import Optional # Must appear once at the top of the file when Optional is used.
Countdown = Optional[int] # in range[0, 10]
# interp. a countdown that has not started yet (None), or is counting down from 10 to 0
C0 = None
C1 = 10
C2 = 7
C3 = 0
@typecheck
# template based on One-Of (2 cases), Atomic Distinct, and Atomic Non-Distinct
def fn_for_countdown(c: Countdown) -> ...:
    if c is None:
        return ...
    else:
        return ...(c)
Forming the Template
As noted above the template, it is formed according to the Data Driven Templates
(https://canvas.ubc.ca/courses/35980/pages/data-driven-templates) rules using the one-of rule,
the atomic distinct rule and the atomic non-distinct rule in order.
Guidance on Data Examples and Function Example/Tests
As always, optionals should have enough data examples to clearly illustrate how the type represents
information.
Functions operating on optionals should have at least one test for None and one for the normal data.
If the normal data is an interval, then there should be tests at all points of variance in the interval.
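As a sketch of the testing guidance above, here is a function built from the Countdown template. The function is_finished is a hypothetical example. @typecheck is omitted so the block is self-contained.

```python
from typing import Optional

Countdown = Optional[int]  # in range[0, 10]
# interp. a countdown that has not started yet (None), or is counting down from 10 to 0

# template based on One-Of (2 cases), Atomic Distinct, and Atomic Non-Distinct
def is_finished(c: Countdown) -> bool:
    """
    return True if the countdown has started and reached 0, otherwise False
    """
    if c is None:
        return False
    else:
        return c == 0

# one test for None
assert is_finished(None) == False
# tests for the normal data: both closed boundaries and an interior point
assert is_finished(10) == False
assert is_finished(7) == False
assert is_finished(0) == True
```

The None case is checked with `is None` before any comparison, so the else branch can treat c as an int.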
COMPOUND DATA
(Introduced in Module 4)
Use compound data when two or more values naturally belong together. We'll use Python's
NamedTuple to store our compound data.
from typing import NamedTuple # Must appear once at the top of the file when NamedTuple is used.
Velocity = NamedTuple('Velocity', [('speed', float),
                                   ('dir', int)]) # in range[0,359]
# interp. a velocity with its speed in m/s and direction as an angle in degrees
# with east=0 increasing counterclockwise
V1 = Velocity(9, 22)
V2 = Velocity(3.4, 180)
@typecheck
def fn_for_velocity(v: Velocity) -> ...: # template based on Compound
    return ...(v.speed,
               v.dir)
The template above is formed according to the Data Driven Templates
(https://canvas.ubc.ca/courses/35980/pages/data-driven-templates) rules using the compound rule.
Then for each of the fields, the result type of the field accessor is used to decide whether the field
accessor itself should be wrapped in another expression. In this case, where the result types are
primitive, no additional wrapping occurs. See cases below for examples of when the reference rule
applies.
Guidance on Data Examples and Function Example/Tests
For compound data definitions it is often useful to have numerous examples, for example to illustrate
special cases. These data examples can also be useful for writing function tests because they save
space in each test.
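A minimal sketch of a function built from the Velocity template, reusing the data examples in its tests. The function reverse is a hypothetical example. @typecheck is omitted so the block runs on its own.

```python
from typing import NamedTuple

Velocity = NamedTuple('Velocity', [('speed', float),
                                   ('dir', int)])  # in range[0,359]
# interp. a velocity with its speed in m/s and direction as an angle in degrees
#         with east=0 increasing counterclockwise

V1 = Velocity(9, 22)
V2 = Velocity(3.4, 180)

def reverse(v: Velocity) -> Velocity:  # template based on Compound
    """
    return a velocity with the same speed as v, heading in the opposite direction
    """
    # the modulo keeps the new direction inside [0, 359]
    return Velocity(v.speed, (v.dir + 180) % 360)

# the data examples keep each test short
assert reverse(V1) == Velocity(9, 202)
assert reverse(V2) == Velocity(3.4, 0)
```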
REFERENCES TO OTHER DATA DEFINITIONS
(Introduced in Module 5)
Some data definitions contain references to other data definitions you have defined (non-primitive
data definitions). One common case is for a compound data definition to reference other user-defined
data definitions. Or, once lists are introduced, for a list to contain elements that are described by
another data definition. In these cases the template of the first data definition should contain calls to
the second data definition's template function wherever the second data appears. For example:
## assume Velocity is as defined above
Car = NamedTuple('Car', [('vel', Velocity),
                         ('accel', float)])
# interp. a car's velocity and acceleration (in m/s^2)
C1 = Car(V1, 9.81)
C2 = Car(V2, 0)
@typecheck
def fn_for_car(c: Car) -> ...: # template based on Compound and Reference
    return ...(fn_for_velocity(c.vel),
               c.accel)
In this case the template is formed according to the Data Driven Templates
(https://canvas.ubc.ca/courses/35980/pages/data-driven-templates) rules by first using the compound
rule. Then, since the result type of c.vel is Velocity, the reference rule is used to wrap the field
accessor so that it becomes fn_for_velocity(c.vel). The field accessor c.accel is not wrapped because it
returns a primitive type.
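The reference rule can be sketched as follows: the call to fn_for_velocity in the Car template becomes a call to a helper function that operates on Velocity. The functions is_stationary and is_at_rest are hypothetical examples. @typecheck is omitted so the block is self-contained.

```python
from typing import NamedTuple

Velocity = NamedTuple('Velocity', [('speed', float),
                                   ('dir', int)])  # in range[0,359]
# interp. a velocity with its speed in m/s and direction as an angle in degrees
#         with east=0 increasing counterclockwise

Car = NamedTuple('Car', [('vel', Velocity),
                         ('accel', float)])
# interp. a car's velocity and acceleration (in m/s^2)

def is_stationary(v: Velocity) -> bool:  # template based on Compound
    """
    return True if the speed of v is 0, otherwise False
    """
    return v.speed == 0

def is_at_rest(c: Car) -> bool:  # template based on Compound and Reference
    """
    return True if the car is stationary and not accelerating, otherwise False
    """
    # c.vel is a Velocity, so it is handed to the Velocity helper
    return is_stationary(c.vel) and c.accel == 0

assert is_at_rest(Car(Velocity(0.0, 0), 0.0)) == True
assert is_at_rest(Car(Velocity(9, 22), 9.81)) == False
assert is_at_rest(Car(Velocity(0.0, 90), 2.0)) == False
```

Keeping the Velocity logic in its own helper means fn_for_car never reaches into the fields of c.vel directly.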
ARBITRARY-SIZED
When the information in the program's domain is of arbitrary size, a data definition that can
store arbitrary-sized data is required. We will often make data definitions for arbitrary-sized
data without giving a specific name to our new type, which is why the data type definition line is
commented out below.
from typing import List # Must appear once at the top of the file when List is used.
# List[str]
# interp. a list of strings
L0 = []
L1 = ["orange", "hi", "truck"]
@typecheck
def fn_for_los(los: List[str]) -> ...: # template based on arbitrary-sized
    # description of the accumulator
    acc = ... # type: ...
    for s in los:
        acc = ...(s, acc)
    return ...(acc)
The template above is formed according to the Data Driven Templates
(https://canvas.ubc.ca/courses/35980/pages/data-driven-templates) rules using the arbitrary-sized rule.
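A minimal sketch of the accumulator template filled in for a list of strings. The function total_length is a hypothetical example. @typecheck is omitted so the block runs without the course library.

```python
from typing import List

# List[str]
# interp. a list of strings
L0 = []
L1 = ["orange", "hi", "truck"]

def total_length(los: List[str]) -> int:  # template based on arbitrary-sized
    """
    return the total number of characters in all the strings in los
    """
    # acc stores the number of characters in the strings seen so far
    acc = 0  # type: int
    for s in los:
        acc = acc + len(s)
    return acc

# the empty case comes first
assert total_length(L0) == 0
assert total_length(L1) == 13
```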
In some cases a list data type definition can have a reference to another type.
## assume Velocity is as defined above
# List[Velocity]
# interp. a list of velocities
L0 = []
L1 = [Velocity(9, 22)]
L2 = [Velocity(9, 22), Velocity(25, 135), Velocity(9, 22)]
@typecheck
# template based on arbitrary-sized and reference rule
def fn_for_lov(lov: List[Velocity]) -> ...:
    # description of the accumulator
    acc = ... # type: ...
    for v in lov:
        acc = ...(fn_for_velocity(v), acc)
    return ...(acc)
Guidance on Data Examples and Function Example/Tests
When writing data and function examples for arbitrary-sized data definitions always put the empty
example first. It's usually trivial for data examples, but many function tests don't work properly if the
empty case isn't working properly, so testing that first can help avoid being confused by a failure in
another test. Also be sure to have a test for a list that is at least two long.
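The sketch below follows this guidance for a list of velocities: the empty list is tested first, then a one-element list, then a longer list. The functions is_moving and count_moving are hypothetical examples. @typecheck is omitted so the block is self-contained.

```python
from typing import List, NamedTuple

Velocity = NamedTuple('Velocity', [('speed', float),
                                   ('dir', int)])  # in range[0,359]
# interp. a velocity with its speed in m/s and direction as an angle in degrees
#         with east=0 increasing counterclockwise

def is_moving(v: Velocity) -> bool:  # template based on Compound
    """
    return True if the speed of v is greater than 0, otherwise False
    """
    return v.speed > 0

# template based on arbitrary-sized and reference rule
def count_moving(lov: List[Velocity]) -> int:
    """
    return the number of velocities in lov that have a speed greater than 0
    """
    # acc stores the number of moving velocities seen so far
    acc = 0  # type: int
    for v in lov:
        if is_moving(v):
            acc = acc + 1
    return acc

# empty case first, then a single element, then a list at least two long
assert count_moving([]) == 0
assert count_moving([Velocity(9, 22)]) == 1
assert count_moving([Velocity(9, 22), Velocity(0, 135), Velocity(9, 22)]) == 2
```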