[INLONG-11352][SDK] Add readme
justinwwhuang committed Oct 17, 2024
1 parent ceedd91 commit 81e93af
Showing 1 changed file with 47 additions and 0 deletions.
47 changes: 47 additions & 0 deletions inlong-sdk/dirty-data-sdk/README.md
@@ -0,0 +1,47 @@
## Overview

This SDK is used to collect dirty data and store it in a designated storage location.

## Features

### Independent SDK

The SDK is standalone and does not depend on platform-specific libraries (such as Flink), so it can
be used by the Agent, DataProxy, and Sort modules.

### Extensible data storage options

Dirty data can be stored in different storage locations (currently, only sending to DataProxy is
supported).

## Usage

### Create a DirtyDataCollector object

```java
Map<String, String> configMap = new ConcurrentHashMap<>();
configMap.put(DIRTY_COLLECT_ENABLE, "true");
configMap.put(DIRTY_SIDE_OUTPUT_IGNORE_ERRORS, "true");
configMap.put(DIRTY_SIDE_OUTPUT_CONNECTOR, "inlong");
configMap.put(DIRTY_SIDE_OUTPUT_LABELS, "key1=value1&key2=value2");
configMap.put(DIRTY_SIDE_OUTPUT_LOG_TAG, "DirtyData");
Configure config = new Configure(configMap);

DirtyDataCollector collector = new DirtyDataCollector();
collector.open(config);
```
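
The `DIRTY_SIDE_OUTPUT_LABELS` value above appears to be a list of `key=value` pairs joined by `&`. The sketch below only illustrates that string format; `LabelFormatSketch` is a hypothetical helper written for this README, not the SDK's own parser, and the SDK may handle edge cases differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: illustrates the "k1=v1&k2=v2" label format only.
class LabelFormatSketch {

    // Splits the label string into an ordered map.
    // Entries that have no '=' or an empty key are skipped.
    static Map<String, String> parse(String labels) {
        Map<String, String> result = new LinkedHashMap<>();
        if (labels == null || labels.isEmpty()) {
            return result;
        }
        for (String pair : labels.split("&")) {
            int idx = pair.indexOf('=');
            if (idx > 0) {
                result.put(pair.substring(0, idx), pair.substring(idx + 1));
            }
        }
        return result;
    }
}
```

Building the string from known keys and values in this shape keeps the configuration readable when several labels are attached.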

### Collect dirty data

```java
// Dirty data is often data that could not be parsed correctly,
// so it is passed as a raw byte[].
byte[] dirtyData = "xxxxxxxxxyyyyyyyyyyyyyy".getBytes(StandardCharsets.UTF_8);
// The kind of error can be labeled here, for example missing fields, type errors, or unknown errors.
String dirtyType = "Undefined";
// Details of errors can be passed here.
Throwable error = new Throwable();
collector.invoke(dirtyData, dirtyType, error);
```
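
In practice, the call usually sits in the `catch` branch of a parsing step, so that records which fail to parse are handed over as raw bytes together with the exception. The following is a minimal sketch of that pattern under stated assumptions: `DirtySink` is a hypothetical stand-in interface that mirrors the `invoke(byte[], String, Throwable)` call shown above, and `ParseWithDirtyCapture` and the `"TypeError"` label are illustrative names that are not defined by the SDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the collector; mirrors the invoke(...) call shown above.
interface DirtySink {
    void invoke(byte[] dirtyData, String dirtyType, Throwable error);
}

class ParseWithDirtyCapture {

    // Parses each record as an integer. Records that fail to parse are
    // passed to the sink as raw bytes and left out of the result.
    static List<Integer> parseAll(List<byte[]> records, DirtySink sink) {
        List<Integer> parsed = new ArrayList<>();
        for (byte[] record : records) {
            try {
                String text = new String(record, StandardCharsets.UTF_8).trim();
                parsed.add(Integer.parseInt(text));
            } catch (NumberFormatException e) {
                // "TypeError" is an illustrative label, not a value defined by the SDK.
                sink.invoke(record, "TypeError", e);
            }
        }
        return parsed;
    }
}
```

With the real SDK, the sink could presumably be a method reference such as `collector::invoke`, provided the signature matches the call shown above.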
