Part 1: Fork and Set Up the Repository
1. Please fork the GitHub repository located at https://github.com/CambioML/uniflow
2. Clone your forked repository to your local machine.
3. When you finish this interview question, share your forked GitHub repo for review.
Part 2: Set up the dev environment and understand the basic flow
1. Set up your dev environment:
a. Create a conda dev environment and activate it.
b. Install poetry through pip install poetry.
c. cd into the uniflow root folder and install all uniflow dependencies through poetry install --no-root. This will install all dependencies listed in pyproject.toml.
2. Understand the basic copy flow, which does nothing but copy the input to the output using a copy_op.
3. Each Op (for example, copy_op) takes one or more nodes as input and outputs one or more nodes. Each node has a value_dict to hold its value.
4. Before a flow exits, value_dict is extracted from the node and returned as the flow output, along with the root node for visualization.
5. The Flow client interface executes the copy flow in this example notebook. For this basic flow, the input and output look like the following (see the client sketch at the end of this part).
Input:
[{"a": 1, "b": 2}, {"c": 3, "d": 4}, {"e": 5, "f": 6}, {"g": 7,
Output:
[{'output': [{'a': 1, 'b': 2}],
'root': <uniflow.node.node.Node at 0x140177fa0>},
{'output': [{'c': 3, 'd': 4}],
'root': <uniflow.node.node.Node at 0x140174310>},
{'output': [{'e': 5, 'f': 6}],
'root': <uniflow.node.node.Node at 0x1401772e0>},
{'output': [{'g': 7, 'h': 8}],
'root': <uniflow.node.node.Node at 0x1401e2c20>}]
Behind the scenes, Uniflow builds and executes a computational DAG. You should be able to access the root node from the output and visualize the computational graph in this example notebook as well.
If you can run each cell of this example notebook locally, your setup is good and you can continue.
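For reference, here is a minimal sketch of invoking the copy flow through the Flow client to produce the output shown above. The import path, the "copy" flow name, and the run method are assumptions based on this description rather than verified uniflow APIs; adjust them to match the example notebook.

# Minimal sketch of calling the copy flow via the Flow client.
# NOTE: the import path, the "copy" flow name, and client.run(...) are
# assumptions; check the example notebook for the actual interface.
from uniflow.flow.client import Client  # assumed import path

client = Client("copy")  # assume the flow is registered under the name "copy"
input_data = [{"a": 1, "b": 2}, {"c": 3, "d": 4}, {"e": 5, "f": 6}, {"g": 7, "h": 8}]
output = client.run(input_data)

# Each output item holds an "output" list with the copied value_dict and a
# "root" Node that can be used to visualize the computational graph.
print(output)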
Part 3: Implement an ExpandReduceFlow and the corresponding ExpandOp and ReduceOp
1. Create a file named expand_op.py and implement a class named ExpandOp that inherits from the Op class within this directory. The ExpandOp class should accept a root node as input and produce two nodes as output. If the root node has a value_dict with n elements, the first output node, expand_1, should contain the first n//2 key-value pairs from value_dict, and the second output node, expand_2, should contain the remaining key-value pairs from indices n//2 to n in its value_dict. The ExpandOp class constructor should also take a function argument so that other strategies for splitting root into expand_1 and expand_2 can be configured. A sketch follows the example below.
root node:
value_dict: {"1": "2", "3": "4", "5": "6", "7": "8"}
expand_1 node:
value_dict: {"1": "2", "3": "4"}
expand_2 node:
value_dict: {"5": "6", "7": "8"}
2. Create a file named reduce_op.py and implement a class called ReduceOp that extends the Op class in this directory. The ReduceOp class should take expand_1 and expand_2 as inputs and produce a single output node, reduce_1. Following the invocation of ReduceOp, the value_dict of the reduce_1 node should be a merged version of the key-value pairs from both expand_1 and expand_2. Additionally, the constructor of the ReduceOp class should accept a function as a parameter, allowing for alternative strategies to combine the value_dict of expand_1 and expand_2 into reduce_1. A sketch follows the example below.
expand_1 node:
value_dict: {"1": "2", "3": "4"}
expand_2 node:
value_dict: {"5": "6", "7": "8"}
reduce_1 node:
value_dict: {"1 5": "2 6", "3 7": "4 8"}
3. Create a file named expand_reduce_flow.py and implement an ExpandReduceFlow class that inherits from the Flow base class. ExpandReduceFlow should use both of the newly implemented ExpandOp and ReduceOp. A sketch follows this item.
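A sketch of ExpandReduceFlow, assuming the Flow base class lets a subclass define a run method that takes the root node and returns the leaf nodes; the actual Flow interface should be copied from the existing copy flow in the repository.

from typing import Sequence

from uniflow.flow.flow import Flow  # assumed module path
from uniflow.node.node import Node

from expand_op import ExpandOp
from reduce_op import ReduceOp


class ExpandReduceFlow(Flow):
    # Splits the root node with ExpandOp and merges the halves with ReduceOp.
    def __init__(self):
        super().__init__()
        self._expand_op = ExpandOp(name="expand_op")
        self._reduce_op = ReduceOp(name="reduce_op")

    def run(self, root: Node) -> Sequence[Node]:
        expand_1, expand_2 = self._expand_op([root])
        return self._reduce_op([expand_1, expand_2])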
4. Register your new ExpandReduceFlow into flow_dict in constants.py so the client can invoke it.
5. Now, run this notebook to execute your ExpandReduceFlow and make sure everything works as expected.
6. Write one unit test, for demo purposes, for expand_op.py, reduce_op.py, or expand_reduce_flow.py (a sketch follows this list).
7. Your code should also be able to handle all possible edge cases without crashing.
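For the unit test in step 6, here is a minimal pytest-style example against the ExpandOp sketch above; the Node constructor arguments are assumptions and should be adjusted to the real signatures.

from uniflow.node.node import Node

from expand_op import ExpandOp


def test_expand_op_splits_value_dict_in_half():
    root = Node(name="root", value_dict={"1": "2", "3": "4", "5": "6", "7": "8"})
    expand_1, expand_2 = ExpandOp(name="expand_op")([root])
    assert expand_1.value_dict == {"1": "2", "3": "4"}
    assert expand_2.value_dict == {"5": "6", "7": "8"}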
Part 4: Dockerize the Application
1. Create a Dockerfile to containerize your new Uniflow application with the latest changes.
Ensure that all the necessary dependencies are installed.
2. Write instructions on how to build and run the Docker container.
Part 5: Deploy on Kubernetes
1. Create a Kubernetes Deployment YAML file to deploy the latest Uniflow application.
2. Include necessary Kubernetes resources like Service, ConfigMap, or Secret if
needed.
3. Write instructions on how to deploy the application on a Kubernetes cluster.
Part 6: Integrate Local Database
1. Select a local database system, such as SQLite or PostgreSQL, and incorporate it
into the Uniflow application.
2. Following the execution of ExpandReduceFlow, store the resulting output key-value pairs in the database. Your implementation must be free of errors, particularly potential race conditions, as Uniflow operates in a multi-threaded environment. A hedged sketch follows below.
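A minimal sketch of the SQLite option, using a module-level lock so concurrent flow executions cannot interleave their writes; the table name and schema are illustrative choices, not requirements.

import sqlite3
import threading

_DB_PATH = "uniflow.db"       # illustrative database file
_db_lock = threading.Lock()   # serialize writes across uniflow worker threads


def init_db() -> None:
    with sqlite3.connect(_DB_PATH) as conn:
        conn.execute("CREATE TABLE IF NOT EXISTS flow_output (key TEXT, value TEXT)")


def store_output(value_dict: dict) -> None:
    # Hold the lock for the whole insert so rows from different flow runs
    # cannot interleave; each call opens its own connection, which keeps
    # SQLite safe in a multi-threaded setting.
    with _db_lock:
        with sqlite3.connect(_DB_PATH) as conn:
            conn.executemany(
                "INSERT INTO flow_output (key, value) VALUES (?, ?)",
                list(value_dict.items()),
            )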
Part 7: Develop Client-Facing APIs
7.1 Asynchronous RESTful Invocation of ExpandReduceFlow
1. Create an API endpoint that allows for the ExpandReduceFlow function to be initiated
asynchronously.
2. Upon calling this API, a unique identifier for the job should be provided for
subsequent tracking of its progress.
3. Add any extra database tables or databases if needed.
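A sketch of the asynchronous kick-off endpoint using FastAPI; the framework choice, route path, and in-memory job store are illustrative assumptions (a real implementation would persist job status in the database from Part 6).

import uuid

from fastapi import BackgroundTasks, FastAPI

app = FastAPI()
jobs = {}  # job_id -> status; replace with a database table in practice


def run_expand_reduce(job_id: str, data: dict) -> None:
    # Placeholder for invoking ExpandReduceFlow and storing its output (Part 6).
    jobs[job_id] = "completed"


@app.post("/flows/expand-reduce")
async def start_flow(data: dict, background_tasks: BackgroundTasks):
    # Return a unique job ID immediately and run the flow in the background.
    job_id = str(uuid.uuid4())
    jobs[job_id] = "pending"
    background_tasks.add_task(run_expand_reduce, job_id, data)
    return {"job_id": job_id}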
7.2 Synchronous RESTful Endpoint to Verify Async Call Status
1. Establish an API endpoint that facilitates the checking of the current status of the
ExpandReduceFlow asynchronous execution, utilizing the previously provided job ID.
2. The API needs to respond with the current status of the job (options could include
states like "pending" or "completed").
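Continuing the FastAPI sketch from 7.1 (same app and jobs objects), a status-check endpoint could look like this:

@app.get("/flows/expand-reduce/{job_id}")
async def get_status(job_id: str):
    # Look up the job in the same store the kick-off endpoint writes to.
    status = jobs.get(job_id)
    if status is None:
        return {"job_id": job_id, "status": "not_found"}
    return {"job_id": job_id, "status": status}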
7.3 Synchronous RESTful Endpoint to Retrieve All key-value Pairs
1. Construct an API endpoint that is capable of retrieving all key-value pairs resulting
from the ExpandReduceFlow process, directly from the database.
2. The API should be equipped to manage pagination, ensuring smooth functionality
even when dealing with a substantial number of key-value pairs.
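A sketch of the paginated retrieval endpoint, reading from the illustrative flow_output table defined in the Part 6 sketch and reusing the FastAPI app from 7.1.

import sqlite3


@app.get("/results")
async def list_results(page: int = 1, page_size: int = 50):
    # page and page_size arrive as query parameters; LIMIT/OFFSET keeps the
    # response bounded even for a large number of key-value pairs.
    offset = (page - 1) * page_size
    with sqlite3.connect("uniflow.db") as conn:
        rows = conn.execute(
            "SELECT key, value FROM flow_output LIMIT ? OFFSET ?",
            (page_size, offset),
        ).fetchall()
    return {
        "page": page,
        "page_size": page_size,
        "results": [{"key": k, "value": v} for k, v in rows],
    }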
Part 8: Documentation
1. Document the setup and installation process for the entire application, including
Docker and Kubernetes here.
2. Provide API documentation, including endpoints, request/response format, and
examples.
Evaluation Criteria
Code Quality: Assess the readability, consistency, and simplicity of the code.
Problem Solving: Evaluate the candidate’s ability to solve problems and implement
solutions.
Knowledge: Assess the candidate’s understanding of Docker, Kubernetes,
databases, and API design.
Documentation: Evaluate the clarity and completeness of the provided
documentation.