Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature/derive_local_path #389

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open

Conversation

BrianWhitneyAI
Copy link
Contributor

@BrianWhitneyAI BrianWhitneyAI commented Jan 10, 2025

Description

The purpose of this PR is to resolve #390. When working with the ETL it became apparent that having an annotation with a unique value for each file puts a lot of stress on the annotation collection. We recently added local file path and Cache Eviction Date to this collection which caused a breakage of our annotation collection rebuild.

The solution we came up with is to remove the local path annotation and derive it from fms ID and show conditionally based on the presence of Cache Eviction Date. Additionally Cache Eviction Date will move to a bucketed date time, decreasing the unique values for files and making the annotation more palatable.

This PR also removes the previous backwards compatibility.

@BrianWhitneyAI
Copy link
Contributor Author

Philip brought up that we could derive from the Cloud path as well which would be much simpler. I opted against this idea since the cloud path is already derived from fms-ID. Open to a discussion here!

@BrianWhitneyAI BrianWhitneyAI marked this pull request as ready for review January 13, 2025 20:30
@@ -85,12 +87,69 @@ export default class FileDetail {
return FileDetail.convertAicsS3PathToHttpUrl(`${AICS_FMS_S3_BUCKET}${pathWithoutDrive}`);
}

private static generateFilePath(env: Environment, fileName: string, fmsId: string): string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These parameters are available on the object so by making it static you make more work for the caller and this method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved by deriving from cloud path in 3bee712

}

// Define prefixes based on the environment
const prefixes: Record<Environment, string> = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This constant can live outside the declaration of this class (just right above where the class is declared)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved in 3bee712

packages/core/entity/FileDetail/index.ts Outdated Show resolved Hide resolved
};

// Get the prefix for the given environment
const prefix = prefixes[env];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefix feels non-descript - perhaps "nasDirectory"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

envToPrefix ?

return filePath;
}

private static convertFMSIDToLocalPath(guid: string): string[] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has basically the same name as the above method with a pretty different purpose - this sounds more like divideFmsIdIntoSegments

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue where because this is static you had to pass an accessible class property - also guid conflicts with the name of the method I'd stick with fileId since its used across the repo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved by removing in 3bee712

}

private static convertFMSIDToLocalPath(guid: string): string[] {
if (!guid) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implicit as part of type

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved by removing in 3bee712


const paths: string[] = [];

while (guid.length > 2) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, but FWIW lodash can do this whole method in one line for ya:


    // Splits the fileId into an array where is character is its own element then chunks them into groups of 3 with the
    // remainder (between 1-3 characters) becoming the final group
    chunk(this.fileId.split(""), 3);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yeah I definitely just ported it over from FSS and didnt consider.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

less custom code to test FTW

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved by deriving from cloud path in 3bee712

private static convertAicsS3PathToHttpUrl(path: string): string {
return `${AICS_FMS_S3_URL_PREFIX}${path}`;
}

constructor(fileDetail: FmsFile, uniqueId?: string) {
constructor(fileDetail: FmsFile, env: Environment, uniqueId?: string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'll be cleaner long-term to keep this out of the properties and instead pass it to a getter method getLocalPath() - this would replace the property localPath (FWIW localPath seen below is a property because of the get before its method name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made getLocalPath as a function in 3bee712. However I left get localPath where it calls getLocaPath() since it has a conditional of Cache eviction date. Lmk what you think about this, open to just removing it and moving the conditional into getLocalPath()

@@ -31,6 +31,12 @@ export const TOP_LEVEL_FILE_ANNOTATIONS = [
description: "Path to file in the cloud.",
type: AnnotationType.STRING,
}),
new Annotation({

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is the pattern for dealing with file properties?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what the comment is here but my interpretation is this format allows us to create our own annotations in BFF which we later propagate with the file properties.

@@ -85,12 +87,69 @@ export default class FileDetail {
return FileDetail.convertAicsS3PathToHttpUrl(`${AICS_FMS_S3_BUCKET}${pathWithoutDrive}`);
}

private static generateFilePath(env: Environment, fileName: string, fmsId: string): string {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does generateFilePath have test cov?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not any more.... removed in 3bee712

}

public get cloudPath(): string {
// REMOVE THIS (BACKWARDS COMPAT)
if (this.path.startsWith("/allen")) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need backward compatibility - should this change (and the block below) use the new heuristic this.getAnnotation("Cache Eviction Date")?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The backwards compatibility is to handle file.path being '/allen' or the updated version where file.path = cloud path. I thought that we were fully past that after the new file rebuild, hence trying to remove it. however seems like there's still a possibility of file.path being /allen as talked about today in #cloud-first-fms so I am in favor of adding it back

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In regards to Cache Eviction Date I don't think that we should use this yet since it has not been backfilled and will only show up on new annotations. We would not reach a state where the file path property was old where we had an eviction date I don't think?

@@ -167,6 +167,19 @@ export default class HttpServiceBase {
return new RestServiceResponse(cachedResponseData);
}

public getEnvironmentFromUrl(): Environment {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has unit test cov?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed in 3bee712

Copy link

@hughes036 hughes036 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See questions about backward compatibility. Lgtm otherwise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature Request: Derive local path from fms ID and show conditionally.
4 participants