Add interface for NoSQL storage #214

Draft: wants to merge 114 commits into base: master
0363899
[system] Add basic interface for allocating NoSQL storage
mcopik Jul 26, 2024
dd3384c
[aws] Implement allocation of DynamoDB tables
mcopik Jul 27, 2024
847534f
[system] Allocate NoSQL tables for the selected benchmark
mcopik Jul 27, 2024
6a62633
[system] Remove debug printouts
mcopik Jul 27, 2024
1c30372
[aws] Add first version of DynamoDB wrapper
mcopik Jul 27, 2024
ed22ea3
[azure] First implementation of wrapper for CosmosDB
mcopik Jul 27, 2024
b21cd34
[aws] Update AWS wrapper with query interface
mcopik Jul 27, 2024
e85c348
[gcp] Add wrapper for google cloud datastore
mcopik Jul 27, 2024
f28b496
[system] Add packages needed to support storage API
mcopik Jul 27, 2024
a1cd286
[aws] Formatting AWS nosql wrapper
mcopik Jul 28, 2024
9b3a284
[system] Split system packages into benchmark modules
mcopik Jul 28, 2024
3f3b440
[aws] Support sorting keys in DynamoDB
mcopik Jul 28, 2024
8ca8e24
[system] Fix incorrect variable name
mcopik Jul 28, 2024
1fdf04f
[system] Adapt cache implementation of NoSQL to support different lay…
mcopik Jul 28, 2024
2cd1bc4
[azure] Implement CosmosDB management
mcopik Jul 28, 2024
f90971f
[azure] Add management of CosmosDB accounts
mcopik Jul 28, 2024
020fcd6
[azure] Install locally CosmosDB library to allocate containers
mcopik Jul 28, 2024
6959cd6
[azure] Add missing handler file
mcopik Jul 28, 2024
f089017
[azure] Allocate clients to databases and containers when they alread…
mcopik Jul 28, 2024
d6e14de
[system] Improved logging
mcopik Jul 28, 2024
0c7906a
[azure] Export NoSQL database information
mcopik Jul 28, 2024
8314f41
[system] Initialize storage before deploying function
mcopik Jul 28, 2024
b975d78
[azure] Help mypy recognize that variable is never None
mcopik Jul 28, 2024
b9ae6ca
[azure] Linting nosql wrapper
mcopik Jul 28, 2024
53bb33e
[azure] Export the proper database name as env
mcopik Jul 28, 2024
8d211a2
[azure] Export tables as envs
mcopik Jul 29, 2024
43680a1
[azure] Fix error of allocating the same container many times
mcopik Jul 29, 2024
cf6fc22
[azure] Avoid creating duplicated HTTP triggers
mcopik Jul 29, 2024
f49714c
[aws] Remove Decimals from DynamoDB results
mcopik Jul 29, 2024
36dd13f
[aws] Append environment variables to send Table data
mcopik Jul 29, 2024
4312ebc
[aws] Linting
mcopik Jul 29, 2024
ec62b9c
[azure] Prevent overwriting existing variables
mcopik Jul 29, 2024
ab62ec9
[azure] Add sleep to avoid problems with stale values of envs
mcopik Jul 29, 2024
e36cf80
[gcp] Add Google Cloud manage image with gcloud
mcopik Jul 29, 2024
69fe546
[system] [aws] Move table management to each implementation to adjust…
mcopik Jul 29, 2024
50f8ac6
[azure] Remove mapping of tables
mcopik Jul 29, 2024
b836597
[system] Fix bug in cache processing - now storage is updated before …
mcopik Jul 29, 2024
853d083
[gcp] Add gcloud CLI handling
mcopik Jul 29, 2024
89560ca
[system] Linting
mcopik Jul 29, 2024
e8288a0
[gcp] Extend Docker images
mcopik Jul 29, 2024
4a944a0
[gcp] Update wrapper with database init
mcopik Jul 29, 2024
e511885
[gcp] Update docs with permissions
mcopik Jul 29, 2024
ce9076b
[aws] Ensure we update envs
mcopik Jul 29, 2024
8204715
[gcp] Add management of datastore instances
mcopik Jul 29, 2024
8e66ca3
[gcp] Correctly mount read-only files in CLI
mcopik Jul 29, 2024
994f107
[gcp] Update function envs to support NoSQL storage
mcopik Jul 29, 2024
22333a5
[benchmarks] Add initial config for 130.crud-api benchmark
mcopik Jul 29, 2024
820c5b4
[benchmarks] Update module configuration
mcopik Jul 29, 2024
85de381
[azure] [gcp] Fix nosql import
mcopik Jul 29, 2024
7f23fe2
[azure] Move storage connection string from request to config
mcopik Jul 29, 2024
2af6082
[system] Update benchmark input config
mcopik Jul 29, 2024
92c7ce6
[gcp] Add missing dependency
mcopik Jul 29, 2024
ed00c14
[experiments] Reorder input processing and function generation
mcopik Jul 29, 2024
12b475d
[whisk] Skip nosql
mcopik Jul 29, 2024
06ece22
[gcp] Fix mypy issues
mcopik Jul 29, 2024
7118b58
[benchmarks] Add test implementation of the CRUD benchmark
mcopik Jul 30, 2024
6c78152
[benchmarks] Work in progress on the CRUD benchmark
mcopik Jul 31, 2024
3d58e42
[storage] Adapt interface to multiple tiers of storage
mcopik Jul 31, 2024
045f077
[local] Adapt local to the new storage config
mcopik Jul 31, 2024
1c48a10
[system] First attempt to break the circular dependency between Syste…
mcopik Jul 31, 2024
b8b04f1
[local] Adapt new resource style
mcopik Jul 31, 2024
c9e2635
[system] Expand storage configuration
mcopik Jul 31, 2024
e9c954b
[system] remove dbg output
mcopik Jul 31, 2024
f4bd76f
[minio] Support data volume and explicit version
mcopik Jul 31, 2024
afa5d7e
[minio] Fix bug in setting volume path
mcopik Jul 31, 2024
a0c88ad
[scylla] Add ScyllaDB as a local NoSQL backend
mcopik Jul 31, 2024
898328f
[scylla] Add stop for the container
mcopik Jul 31, 2024
18f4692
[aws] Restructure storage allocation
mcopik Jul 31, 2024
67fa42b
[benchmarks] Updated version of 130.crud-api
mcopik Jul 31, 2024
da0e009
[system] Add initialization of NoSQL results
mcopik Jul 31, 2024
f953768
[aws] Implement initialization of DynamoDB
mcopik Jul 31, 2024
e5d72b8
[system] Fix cache issue - we now prepare benchmark data before code …
mcopik Jul 31, 2024
dc60969
[benchmarks] Update input API
mcopik Jul 31, 2024
ef54402
[system] Additional logging
mcopik Aug 1, 2024
ff9314c
[benchmarks] Update input API
mcopik Aug 1, 2024
c6463e2
[benchmarks] Fix incorrect IDs and division by zero
mcopik Aug 1, 2024
9916028
[benchmarks] Implement final sizes for 130.crud-api
mcopik Aug 1, 2024
9dc5d03
[aws] Implement the update method for DynamoDB
mcopik Aug 2, 2024
078bdbc
[aws] Fix waiting condition when creating DynamoDB table
mcopik Aug 2, 2024
3d7d1dc
[gcp] Implement new resource class
mcopik Aug 2, 2024
51fafd4
[gcp] Implement upload function for Datastore
mcopik Aug 2, 2024
0843b00
[gcp] Add wrapper for update function
mcopik Aug 2, 2024
1a7325f
[gcp] Fix incorrect update implementation
mcopik Aug 2, 2024
23ed66b
[gcp] Fix linting issues
mcopik Aug 2, 2024
d277923
[azure] Move resources to a different file
mcopik Aug 2, 2024
ff1a696
[azure] Linting
mcopik Aug 2, 2024
1077605
[azure] Implement the missing function for CosmosDB
mcopik Aug 2, 2024
6758863
[azure] Add the new abstraction of system resources
mcopik Aug 2, 2024
b9dfc92
[azure] Replace insert with upsert
mcopik Aug 2, 2024
383a62b
[azure] Add NoSQL update
mcopik Aug 2, 2024
ecc2816
[azure] Fix allocation of database client
mcopik Aug 3, 2024
0e1e1fc
[local] Add base methods for local storage config
mcopik Aug 7, 2024
945b0fd
[local] Make storage paths consistent
mcopik Aug 7, 2024
be85773
[local] Drop unused storage config in output
mcopik Aug 7, 2024
7bb8d87
[local] Ignore Docker volumes in git
mcopik Aug 7, 2024
a5977b7
[system] Allow for multiple input files with storage definition
mcopik Aug 7, 2024
c168c0f
[local] Better error description
mcopik Aug 7, 2024
2caa564
[system] Remove debug printout
mcopik Aug 7, 2024
2e6f75a
[local] Make storage implementation cacheable and more generic by off…
mcopik Aug 7, 2024
92d610b
[local] Final standardization of volume names
mcopik Aug 7, 2024
3ac12e7
[storage] Remove unnecessary storage classes
mcopik Aug 7, 2024
a76c7d9
[system] Fix type hint
mcopik Aug 7, 2024
0a678b5
[local] Support ScyllaDB serialization
mcopik Aug 7, 2024
cc9c103
[local] Linting
mcopik Aug 7, 2024
cf66c2c
[system] Support removing functions that are no longer available
mcopik Aug 7, 2024
8db684c
[aws] Linting
mcopik Aug 7, 2024
223f368
[storage] Implement allocation of ScyllaDB tables
mcopik Aug 7, 2024
124cf79
[storage] Fix correct deployment name in output
mcopik Aug 7, 2024
ecc5cbf
[storage] Linting
mcopik Aug 7, 2024
dae8ec1
[local] Export NoSQL settings
mcopik Aug 7, 2024
1796ca8
[storage] Add default implementation of envs for NoSQL
mcopik Aug 7, 2024
e504674
[local] Add ScyllaDB wrapper
mcopik Aug 7, 2024
0642eaa
[local] Implement function update for local container
mcopik Aug 7, 2024
b293f7a
[storage] Linting
mcopik Aug 7, 2024
3 changes: 3 additions & 0 deletions .gitignore
@@ -5,6 +5,9 @@ perf-cost*
 python-venv
 *cache*

+minio-volume
+scylladb-volume
+

 # Byte-compiled / optimized / DLL files
 __pycache__/
9 changes: 9 additions & 0 deletions .mypy.ini
@@ -27,6 +27,15 @@ ignore_missing_imports = True
 [mypy-google.cloud]
 ignore_missing_imports = True

+[mypy-google.cloud.logging]
+ignore_missing_imports = True
+
+[mypy-google.cloud.monitoring_v3]
+ignore_missing_imports = True
+
+[mypy-google.cloud.storage]
+ignore_missing_imports = True
+
 [mypy-google.api_core]
 ignore_missing_imports = True
3 changes: 2 additions & 1 deletion benchmarks/100.webapps/110.dynamic-html/config.json
@@ -1,5 +1,6 @@
 {
   "timeout": 10,
   "memory": 128,
-  "languages": ["python", "nodejs"]
+  "languages": ["python", "nodejs"],
+  "modules": []
 }
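
Each config.json touched by this PR gains a "modules" list naming the resources a benchmark needs. A minimal sketch of how a loader might read that list is shown below. The helper `parse_modules`, the `KNOWN_MODULES` set, and the default for a missing key are assumptions made for illustration; the PR's actual processing code is not shown on this page.

```python
import json

# Module names that appear in this PR's config.json files. Treating them as
# the complete set of valid modules is an assumption of this sketch.
KNOWN_MODULES = {"storage", "nosql"}


def parse_modules(config_text: str) -> set:
    """Return the resource modules requested by a benchmark config.

    A missing "modules" key is read as "no modules"; that default is an
    assumption, not behaviour confirmed by the PR.
    """
    config = json.loads(config_text)
    modules = set(config.get("modules", []))
    unknown = modules - KNOWN_MODULES
    if unknown:
        raise ValueError(f"Unknown modules: {sorted(unknown)}")
    return modules


# Configs copied from the diffs for 110.dynamic-html and 120.uploader.
dynamic_html = parse_modules(
    '{"timeout": 10, "memory": 128, "languages": ["python", "nodejs"], "modules": []}'
)
uploader = parse_modules(
    '{"timeout": 30, "memory": 128, "languages": ["python", "nodejs"], "modules": ["storage"]}'
)
```

Rejecting unknown names early is one possible design choice; a loader could equally ignore them if new modules are expected to appear over time.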
5 changes: 1 addition & 4 deletions benchmarks/100.webapps/110.dynamic-html/input.py
@@ -5,10 +5,7 @@
     'large': 100000
 }

-def buckets_count():
-    return (0, 0)
-
-def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func):
+def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func):
     input_config = {'username': 'testname'}
     input_config['random_len'] = size_generators[size]
     return input_config
3 changes: 2 additions & 1 deletion benchmarks/100.webapps/120.uploader/config.json
@@ -1,5 +1,6 @@
 {
   "timeout": 30,
   "memory": 128,
-  "languages": ["python", "nodejs"]
+  "languages": ["python", "nodejs"],
+  "modules": ["storage"]
 }
2 changes: 1 addition & 1 deletion benchmarks/100.webapps/120.uploader/input.py
@@ -11,7 +11,7 @@
 def buckets_count():
     return (0, 1)

-def generate_input(data_dir, size, benchmarks_bucket, input_buckets, output_buckets, upload_func):
+def generate_input(data_dir, size, benchmarks_bucket, input_buckets, output_buckets, upload_func, nosql_func):
     input_config = {'object': {}, 'bucket': {}}
     input_config['object']['url'] = url_generators[size]
     input_config['bucket']['bucket'] = benchmarks_bucket
11 changes: 11 additions & 0 deletions benchmarks/100.webapps/130.crud-api/config.json
@@ -0,0 +1,11 @@
{
  "timeout": 30,
  "memory": 128,
  "languages": [
    "python",
    "nodejs"
  ],
  "modules": [

Collaborator:
My understanding is that we will list resource 'modules' here, e.g., nosql, storage, queues. How would this be adapted in the case of applications, where each function might have different requirements, such that the code that processes this JSON and creates said resources can do so as seamlessly as possible?

Collaborator Author:
@oanarosca That is an excellent question on which I'm working right now. At this moment, I think the best option would be to make the JSON a collection of configuration points: one general (the benchmark's language), and the other with per-function config.

    "nosql"
  ]
}
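
The reply above proposes splitting the benchmark JSON into a general part and per-function parts. One way such a layout might be read is sketched below. The keys "general" and "functions", the function names "crud" and "upload", and both helpers are hypothetical and do not appear anywhere in this PR; the author describes the design as still in progress.

```python
import json

# Hypothetical layout: one general block plus one block per function.
# None of these key names come from the PR; they are illustration only.
EXAMPLE_CONFIG = """
{
  "general": {"languages": ["python", "nodejs"]},
  "functions": {
    "crud": {"timeout": 30, "memory": 128, "modules": ["nosql"]},
    "upload": {"timeout": 30, "memory": 128, "modules": ["storage"]}
  }
}
"""


def required_modules(config: dict) -> dict:
    """Map each function name to the set of resource modules it requests."""
    return {
        name: set(function_config.get("modules", []))
        for name, function_config in config["functions"].items()
    }


def all_modules(config: dict) -> set:
    """Union of modules across all functions of one application."""
    result = set()
    for modules in required_modules(config).values():
        result |= modules
    return result


config = json.loads(EXAMPLE_CONFIG)
per_function = required_modules(config)
```

Under this assumed layout, resource creation could iterate over `all_modules(config)` once per application, while deployment of each function consults `per_function`.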
96 changes: 96 additions & 0 deletions benchmarks/100.webapps/130.crud-api/input.py
@@ -0,0 +1,96 @@
import uuid


def allocate_nosql() -> dict:
    return {"shopping_cart": {"primary_key": "cart_id", "secondary_key": "product_id"}}


def generate_input(
    data_dir, size, benchmarks_bucket, input_buckets, output_buckets, upload_func, nosql_func
):

    input_config = {}

    cart_id = str(uuid.uuid4().hex)
    write_cart_id = str(uuid.uuid4().hex)

    # Set initial data

    nosql_func(

Collaborator:
Nit: perhaps this could be renamed to something more descriptive

Collaborator Author:
@oanarosca Good catch, will fix! :-)

        "130.crud-api",
        "shopping_cart",
        {"name": "Gothic Game", "price": 42, "quantity": 2},
        ("cart_id", cart_id),
        ("product_id", "game-gothic"),
    )
    nosql_func(
        "130.crud-api",
        "shopping_cart",
        {"name": "Gothic 2", "price": 142, "quantity": 3},
        ("cart_id", cart_id),
        ("product_id", "game-gothic-2"),
    )
    nosql_func(
        "130.crud-api",
        "shopping_cart",
        {"name": "SeBS Benchmark", "price": 1000, "quantity": 1},
        ("cart_id", cart_id),
        ("product_id", "sebs-benchmark"),
    )
    nosql_func(
        "130.crud-api",
        "shopping_cart",
        {"name": "Mint Linux", "price": 0, "quantity": 5},
        ("cart_id", cart_id),
        ("product_id", "mint-linux"),
    )

    requests = []

    if size == "test":
        # retrieve a single entry
        requests.append(
            {
                "route": "GET /cart/{id}",
                "path": {"id": "game-gothic"},
                "body": {
                    "cart": cart_id,
                },
            }
        )
    elif size == "small":
        requests.append(
            {
                "route": "GET /cart",
                "body": {
                    "cart": cart_id,
                },
            }
        )
    elif size == "large":
        # add many new entries
        for i in range(5):
            requests.append(
                {
                    "route": "PUT /cart",
                    "body": {
                        "cart": write_cart_id,
                        "product_id": f"new-id-{i}",
                        "name": f"Test Item {i}",
                        "price": 100 * i,
                        "quantity": i,
                    },
                }
            )
        requests.append(
            {
                "route": "GET /cart",
                "body": {
                    "cart": write_cart_id,
                },
            }
        )

    input_config["requests"] = requests

    return input_config
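
In the file above, `nosql_func` is called with the benchmark name, the table name, a data dictionary, and two (name, value) key tuples. For dry-running an input generator without any cloud backend, a recording stand-in with the same call shape could be used. The sketch below is such a stand-in; `make_recording_nosql_func` and the cart ID "cart-1" are hypothetical and not part of the PR.

```python
from typing import Dict, List, Tuple


def make_recording_nosql_func():
    """Return (nosql_func, tables); nosql_func stores rows in memory only."""
    tables: Dict[Tuple[str, str], List[dict]] = {}

    def nosql_func(benchmark, table, data, primary_key, secondary_key):
        # Store the key attributes next to the payload, one dict per row.
        row = dict(data)
        row[primary_key[0]] = primary_key[1]
        row[secondary_key[0]] = secondary_key[1]
        tables.setdefault((benchmark, table), []).append(row)

    return nosql_func, tables


nosql_func, tables = make_recording_nosql_func()

# Same argument order as the calls in generate_input above.
nosql_func(
    "130.crud-api",
    "shopping_cart",
    {"name": "Gothic Game", "price": 42, "quantity": 2},
    ("cart_id", "cart-1"),
    ("product_id", "game-gothic"),
)
rows = tables[("130.crud-api", "shopping_cart")]
```

Passing this `nosql_func` as the last argument of `generate_input` would let the generated requests be inspected together with the rows they expect to find.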
67 changes: 67 additions & 0 deletions benchmarks/100.webapps/130.crud-api/python/function.py
@@ -0,0 +1,67 @@
from . import nosql

nosql_client = nosql.nosql.get_instance()

nosql_table_name = "shopping_cart"


def add_product(cart_id: str, product_id: str, product_name: str, price: float, quantity: int):

    nosql_client.insert(
        nosql_table_name,
        ("cart_id", cart_id),
        ("product_id", product_id),
        {"price": price, "quantity": quantity, "name": product_name},
    )


def get_products(cart_id: str, product_id: str):
    return nosql_client.get(nosql_table_name, ("cart_id", cart_id), ("product_id", product_id))


def query_products(cart_id: str):

    res = nosql_client.query(
        nosql_table_name,
        ("cart_id", cart_id),
        "product_id",
    )

    products = []
    price_sum = 0
    quantity_sum = 0
    for product in res:

        products.append(product["name"])
        price_sum += product["price"]
        quantity_sum += product["quantity"]

    avg_price = price_sum / quantity_sum if quantity_sum > 0 else 0.0

    return {"products": products, "total_cost": price_sum, "avg_price": avg_price}


def handler(event):

    results = []

    for request in event["requests"]:

        route = request["route"]
        body = request["body"]

        if route == "PUT /cart":
            add_product(
                body["cart"], body["product_id"], body["name"], body["price"], body["quantity"]
            )
            res = {}
        elif route == "GET /cart/{id}":
            res = get_products(body["cart"], request["path"]["id"])
        elif route == "GET /cart":
            res = query_products(body["cart"])
        else:
            raise RuntimeError(f"Unknown request route: {route}")

        results.append(res)

    return {"result": results}
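
The function above relies on three client calls: `insert(table, primary_key, secondary_key, data)`, `get(table, primary_key, secondary_key)`, and `query(table, primary_key, secondary_key_name)`. Note that the keys come before the data here, unlike in `nosql_func` in input.py. To exercise this call shape locally, an in-memory test double might look like the sketch below. `InMemoryNoSQL` is hypothetical; it is not one of the DynamoDB, CosmosDB, Datastore, or ScyllaDB wrappers added in this PR, and their exact semantics may differ.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (attribute name, value), as passed in function.py


class InMemoryNoSQL:
    """Hypothetical test double mirroring the call shapes used above."""

    def __init__(self) -> None:
        # table name -> {(primary value, secondary value): row}
        self._tables: Dict[str, Dict[Tuple[str, str], dict]] = {}

    def insert(self, table: str, primary_key: Key, secondary_key: Key, data: dict) -> None:
        row = {primary_key[0]: primary_key[1], secondary_key[0]: secondary_key[1], **data}
        self._tables.setdefault(table, {})[(primary_key[1], secondary_key[1])] = row

    def get(self, table: str, primary_key: Key, secondary_key: Key) -> dict:
        return self._tables[table][(primary_key[1], secondary_key[1])]

    def query(self, table: str, primary_key: Key, secondary_key_name: str) -> List[dict]:
        # Return every row that shares the primary key value.
        rows = self._tables.get(table, {})
        return [row for (pk, _), row in rows.items() if pk == primary_key[1]]


client = InMemoryNoSQL()
client.insert(
    "shopping_cart", ("cart_id", "c1"), ("product_id", "p1"),
    {"name": "A", "price": 10, "quantity": 1},
)
client.insert(
    "shopping_cart", ("cart_id", "c1"), ("product_id", "p2"),
    {"name": "B", "price": 20, "quantity": 3},
)
client.insert(
    "shopping_cart", ("cart_id", "c2"), ("product_id", "p1"),
    {"name": "C", "price": 5, "quantity": 1},
)

cart = client.query("shopping_cart", ("cart_id", "c1"), "product_id")
names = [row["name"] for row in cart]
```

Keying rows by the (primary, secondary) value pair makes a repeated insert overwrite the earlier row, which is an assumption chosen for simplicity rather than a documented property of the real backends.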
3 changes: 2 additions & 1 deletion benchmarks/200.multimedia/210.thumbnailer/config.json
@@ -1,5 +1,6 @@
 {
   "timeout": 60,
   "memory": 256,
-  "languages": ["python", "nodejs"]
+  "languages": ["python", "nodejs"],
+  "modules": ["storage"]
 }
2 changes: 1 addition & 1 deletion benchmarks/200.multimedia/210.thumbnailer/input.py
@@ -12,7 +12,7 @@ def buckets_count():
     :param output_buckets:
     :param upload_func: upload function taking three params(bucket_idx, key, filepath)
 '''
-def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func):
+def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func):

     for file in glob.glob(os.path.join(data_dir, '*.jpg')):
         img = os.path.relpath(file, data_dir)
3 changes: 2 additions & 1 deletion benchmarks/200.multimedia/220.video-processing/config.json
@@ -1,5 +1,6 @@
 {
   "timeout": 60,
   "memory": 512,
-  "languages": ["python"]
+  "languages": ["python"],
+  "modules": ["storage"]
 }
2 changes: 1 addition & 1 deletion benchmarks/200.multimedia/220.video-processing/input.py
@@ -12,7 +12,7 @@ def buckets_count():
     :param output_buckets:
     :param upload_func: upload function taking three params(bucket_idx, key, filepath)
 '''
-def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func):
+def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func):
     for file in glob.glob(os.path.join(data_dir, '*.mp4')):
         img = os.path.relpath(file, data_dir)
         upload_func(0, img, file)
3 changes: 2 additions & 1 deletion benchmarks/300.utilities/311.compression/config.json
@@ -1,5 +1,6 @@
 {
   "timeout": 60,
   "memory": 256,
-  "languages": ["python", "nodejs"]
+  "languages": ["python", "nodejs"],
+  "modules": ["storage"]
 }
2 changes: 1 addition & 1 deletion benchmarks/300.utilities/311.compression/input.py
@@ -22,7 +22,7 @@ def upload_files(data_root, data_dir, upload_func):
     :param output_buckets:
     :param upload_func: upload function taking three params(bucket_idx, key, filepath)
 '''
-def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func):
+def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func):

     # upload different datasets
     datasets = []
3 changes: 2 additions & 1 deletion benchmarks/400.inference/411.image-recognition/config.json
@@ -1,5 +1,6 @@
 {
   "timeout": 60,
   "memory": 512,
-  "languages": ["python"]
+  "languages": ["python"],
+  "modules": ["storage"]
 }
2 changes: 1 addition & 1 deletion benchmarks/400.inference/411.image-recognition/input.py
@@ -21,7 +21,7 @@ def upload_files(data_root, data_dir, upload_func):
     :param output_buckets:
     :param upload_func: upload function taking three params(bucket_idx, key, filepath)
 '''
-def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func):
+def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func):

     # upload model
     model_name = 'resnet50-19c8e357.pth'
3 changes: 2 additions & 1 deletion benchmarks/500.scientific/501.graph-pagerank/config.json
@@ -1,5 +1,6 @@
 {
   "timeout": 120,
   "memory": 512,
-  "languages": ["python"]
+  "languages": ["python"],
+  "modules": []
 }
5 changes: 1 addition & 4 deletions benchmarks/500.scientific/501.graph-pagerank/input.py
@@ -4,8 +4,5 @@
     'large': 100000
 }

-def buckets_count():
-    return (0, 0)
-
-def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func):
+def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func):
     return { 'size': size_generators[size] }
3 changes: 2 additions & 1 deletion benchmarks/500.scientific/502.graph-mst/config.json
@@ -1,5 +1,6 @@
 {
   "timeout": 120,
   "memory": 512,
-  "languages": ["python"]
+  "languages": ["python"],
+  "modules": []
 }
5 changes: 1 addition & 4 deletions benchmarks/500.scientific/502.graph-mst/input.py
@@ -4,8 +4,5 @@
     'large': 100000
 }

-def buckets_count():
-    return (0, 0)
-
-def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func):
+def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func):
     return { 'size': size_generators[size] }
3 changes: 2 additions & 1 deletion benchmarks/500.scientific/503.graph-bfs/config.json
@@ -1,5 +1,6 @@
 {
   "timeout": 120,
   "memory": 512,
-  "languages": ["python"]
+  "languages": ["python"],
+  "modules": []
 }
5 changes: 1 addition & 4 deletions benchmarks/500.scientific/503.graph-bfs/input.py
@@ -4,8 +4,5 @@
     'large': 100000
 }

-def buckets_count():
-    return (0, 0)
-
-def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func):
+def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func):
     return { 'size': size_generators[size] }
3 changes: 2 additions & 1 deletion benchmarks/500.scientific/504.dna-visualisation/config.json
@@ -1,5 +1,6 @@
 {
   "timeout": 60,
   "memory": 2048,
-  "languages": ["python"]
+  "languages": ["python"],
+  "modules": ["storage"]
 }
2 changes: 1 addition & 1 deletion benchmarks/500.scientific/504.dna-visualisation/input.py
@@ -3,7 +3,7 @@
 def buckets_count():
     return (1, 1)

-def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func):
+def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func):

     for file in glob.glob(os.path.join(data_dir, '*.fasta')):
         data = os.path.relpath(file, data_dir)