[Bug]: RuntimeError: Fastdup execution failed #361
Even when simplified to the code below, I still get the same error message:
np.save("./input_dir/img_embds_numpy.npy", imgs_embs_array)
import fastdup
fd = fastdup.create(work_dir="work_dir/", input_dir="input_dir/")
fd.run()
NoneType: None
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[66], line 4
2 import fastdup
3 fd = fastdup.create(work_dir="work_dir/", input_dir="input_dir/")
----> 4 fd.run()
File /opt/conda/lib/python3.10/site-packages/fastdup/engine.py:157, in Fastdup.run(self, input_dir, annotations, embeddings, subset, data_type, overwrite, model_path, distance, nearest_neighbors_k, threshold, outlier_percentile, num_threads, num_images, verbose, license, high_accuracy, cc_threshold, **kwargs)
154 fastdup_func_params['model_path'] = model_path
155 fastdup_func_params.update(kwargs)
--> 157 return super().run(annotations=annotations, input_dir=input_dir, subset=subset, data_type=data_type,
158 overwrite=overwrite, embeddings=embeddings, **fastdup_func_params)
File /opt/conda/lib/python3.10/site-packages/fastdup/sentry.py:146, in v1_sentry_handler.<locals>.inner_function(*args, **kwargs)
144 else:
145 fastdup_capture_exception(f"V1:{func.__name__}", ex)
--> 146 raise ex
148 except Exception as ex:
149 fastdup_capture_exception(f"V1:{func.__name__}", ex)
File /opt/conda/lib/python3.10/site-packages/fastdup/sentry.py:137, in v1_sentry_handler.<locals>.inner_function(*args, **kwargs)
135 try:
136 start_time = time.time()
--> 137 ret = func(*args, **kwargs)
138 fastdup_performance_capture(f"V1:{func.__name__}", start_time)
139 return ret
File /opt/conda/lib/python3.10/site-packages/fastdup/fastdup_controller.py:618, in FastdupController.run(self, input_dir, annotations, subset, embeddings, data_type, overwrite, print_summary, print_vl_datasets_ref, run_explore, dataset_name, verbose, run_fast, **fastdup_kwargs)
616 if not run_fast:
617 if fastdup.run(fastdup_input, work_dir=str(self._work_dir), logger=self._logger, **fastdup_kwargs) != 0:
--> 618 raise RuntimeError('Fastdup execution failed')
620 # post process - map fastdup-id to image (for bbox this is done in self._set_fastdup_input)
621 if self._dtype == FD.IMG or self._run_mode == FD.MODE_CROP:
RuntimeError: Fastdup execution failed
Hello @rapidcrawler, an example for loading the features is here:
Please try it out and let us know if this works.
Thanks @dbickson, it's working now. The general idea helped. Since I don't have direct access to the images right now, only the image embeddings, I couldn't use save_binary_feature. Instead, I passed the available embeddings directly:
from fastdup.engine import Fastdup
fd = Fastdup(input_dir="/")
fd.run(embeddings=np.array(embs)
       , annotations=annotations_df
       , overwrite=True)
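Before handing a precomputed array to run(embeddings=...), it can help to sanity-check it. The sketch below is not part of fastdup: check_embeddings is a hypothetical helper, and the rules it applies (a 2-D float32 array with one row per annotation row and no NaN/inf values) are assumptions about what the library wants, not documented requirements confirmed in this thread.

```python
import numpy as np


def check_embeddings(embs, n_annotations):
    """Return a list of problems found; an empty list means nothing suspicious.

    Hypothetical helper. The checks encode assumptions about the input
    fastdup expects, not a specification taken from the library.
    """
    try:
        arr = np.asarray(embs)
    except ValueError as exc:
        # Recent NumPy versions raise here for ragged (unequal-length) rows.
        return [f"could not build an array: {exc}"]

    if arr.dtype.kind not in "fiu":
        # Object or string arrays usually mean ragged or non-numeric rows.
        return [f"non-numeric dtype {arr.dtype}"]
    if arr.ndim != 2:
        return [f"expected a 2-D array, got ndim={arr.ndim}"]

    problems = []
    if arr.shape[0] != n_annotations:
        problems.append(
            f"{arr.shape[0]} embedding rows vs {n_annotations} annotation rows"
        )
    if arr.dtype != np.float32:
        problems.append(f"dtype is {arr.dtype}, not float32")
    if not np.isfinite(arr).all():
        problems.append("array contains NaN or inf values")
    return problems
```

With the names from the comment above, this would be called as check_embeddings(np.array(embs), len(annotations_df)) just before fd.run(...).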
BTW @dbickson, is there any reason why the library returns an error if I pass more than 5k embeddings at a time? For example, the code below slices the first 5k rows; it executes successfully and returns the expected answer:
start = dt.now()
from fastdup.engine import Fastdup
fd = Fastdup(input_dir="/")
fd.run(embeddings=np.array(embs)[:5000]
, annotations=annotations_df.head(5000)
, overwrite=True
, verbose=True)
df_sim = fd.similarity()
end = dt.now()
df_sim
However, if I increase the slice to 6k or 10k, it returns the error message below:
start = dt.now()
from fastdup.engine import Fastdup
fd = Fastdup(input_dir="/")
fd.run(embeddings=np.array(embs)[:6000]
, annotations=annotations_df.head(6000)
, overwrite=True
, verbose=True)
df_sim = fd.similarity()
end = dt.now()
df_sim
fastdup By Visual Layer, Inc. 2024. All rights reserved.
A fastdup dataset object was created!
Input directory is set to "/"
Work directory is set to "work_dir"
The next steps are:
1. Analyze your dataset with the .run() function of the dataset object
2. Interactively explore your data on your local machine with the .explore() function of the dataset object
For more information, use help(fastdup) or check our documentation https://docs.visual-layer.com/docs/getting-started-with-fastdup.
2025-01-07 18:05:25 [FATAL] Failed to read any features
NoneType: None
fastdup C++ error received: 2025-01-07 18:05:25 [FATAL] Failed to read any features
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[42], line 7
2 from fastdup.engine import Fastdup
4 fd = Fastdup(input_dir="/")
----> 7 fd.run(embeddings=np.array(embs)[:6000]
8 , annotations=annotations_df.head(6000)
9 , overwrite=True)
11 df_sim = fd.similarity()
12 end = dt.now()
File /opt/conda/lib/python3.10/site-packages/fastdup/engine.py:157, in Fastdup.run(self, input_dir, annotations, embeddings, subset, data_type, overwrite, model_path, distance, nearest_neighbors_k, threshold, outlier_percentile, num_threads, num_images, verbose, license, high_accuracy, cc_threshold, **kwargs)
154 fastdup_func_params['model_path'] = model_path
155 fastdup_func_params.update(kwargs)
--> 157 return super().run(annotations=annotations, input_dir=input_dir, subset=subset, data_type=data_type,
158 overwrite=overwrite, embeddings=embeddings, **fastdup_func_params)
File /opt/conda/lib/python3.10/site-packages/fastdup/sentry.py:146, in v1_sentry_handler.<locals>.inner_function(*args, **kwargs)
144 else:
145 fastdup_capture_exception(f"V1:{func.__name__}", ex)
--> 146 raise ex
148 except Exception as ex:
149 fastdup_capture_exception(f"V1:{func.__name__}", ex)
File /opt/conda/lib/python3.10/site-packages/fastdup/sentry.py:137, in v1_sentry_handler.<locals>.inner_function(*args, **kwargs)
135 try:
136 start_time = time.time()
--> 137 ret = func(*args, **kwargs)
138 fastdup_performance_capture(f"V1:{func.__name__}", start_time)
139 return ret
File /opt/conda/lib/python3.10/site-packages/fastdup/fastdup_controller.py:618, in FastdupController.run(self, input_dir, annotations, subset, embeddings, data_type, overwrite, print_summary, print_vl_datasets_ref, run_explore, dataset_name, verbose, run_fast, **fastdup_kwargs)
616 if not run_fast:
617 if fastdup.run(fastdup_input, work_dir=str(self._work_dir), logger=self._logger, **fastdup_kwargs) != 0:
--> 618 raise RuntimeError('Fastdup execution failed')
620 # post process - map fastdup-id to image (for bbox this is done in self._set_fastdup_input)
621 if self._dtype == FD.IMG or self._run_mode == FD.MODE_CROP:
RuntimeError: Fastdup execution failed
Today it only runs if the embeddings array has ~500 rows. Anything more than 500 embeddings is throwing
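Since the working size seems to move between runs (5k earlier, ~500 here), one way to pin down where the failure starts is to bisect the slice size. The sketch below is plain Python and not part of fastdup. try_run is a hypothetical callable that runs the pipeline on the first n rows and returns True on success, for example by wrapping fd.run(...) in a try/except RuntimeError. It assumes that failure is monotonic, that is, if n rows fail then every larger n fails too.

```python
def largest_working_size(try_run, lo, hi):
    """Return the largest n in [lo, hi] for which try_run(n) is True.

    Assumes try_run(lo) succeeds and that failures are monotonic in n.
    try_run is a hypothetical callable supplied by the caller.
    """
    if not try_run(lo):
        raise ValueError("try_run(lo) must succeed to start the search")
    while lo < hi:
        # Round up so the loop always makes progress when hi == lo + 1.
        mid = (lo + hi + 1) // 2
        if try_run(mid):
            lo = mid       # mid works, so the answer is mid or larger
        else:
            hi = mid - 1   # mid fails, so the answer is below mid
    return lo


if __name__ == "__main__":
    # Stub predicate standing in for a real run; 5432 is an arbitrary cutoff.
    print(largest_working_size(lambda n: n <= 5432, 5000, 6000))
```

Each probe is a full run, so the search costs about log2(hi - lo) runs. If the result changes between repeated searches, the failure is probably not a pure size limit.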
Hi @rapidcrawler, this is weird. Can you call run() with verbose=1 so we can see what the issue is? Alternatively, you can use the v0.2 API, namely:
Let us know if this works for you.
What happened?
imgs_embs_array is a NumPy array of image embeddings.
What did you expect to see?
No response
What version of fastdup were you running on?
2.14
What version of Python were you running on?
Python 3.10
Operating System
[GCC 13.3.0]
Reproduction steps
No response
Relevant log output
No response