Cannot find package 'tokenizers-linux-x64-musl' - Alpine support #1703

Open
PylotLight opened this issue Dec 14, 2024 · 2 comments

PylotLight commented Dec 14, 2024

Creating another issue for tokenizers support on Alpine.
Error:

error: Cannot find package 'tokenizers-linux-x64-musl' from '/usr/src/app/node_modules/tokenizers/index.js'
Bun v1.1.38 (Linux x64 baseline)

/usr/src/app # ./mycli 
155 |         if (isMusl()) {
156 |           localFileExisted = existsSync(join(__dirname, "tokenizers.linux-x64-musl.node"));
157 |           try {
158 |             if (localFileExisted) {
159 |               nativeBinding = (()=>{throw new Error("Cannot require module "+"./tokenizers.linux-x64-musl.node");})();
160 |               nativeBinding = (()=>{throw new Error("Cannot require module "+"tokenizers-linux-x64-musl");})();
                                                ^
error: Cannot require module tokenizers-linux-x64-musl
      at /$bunfs/root/mycli:160:43
      at /$bunfs/root/mycli:160:109
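
The loader excerpt above picks a platform-specific binding after an `isMusl()` check. As a rough sketch of that decision (not the package's actual code), the snippet below reports which binding name would be expected on the current machine. The helper names `looksLikeMusl` and `expectedBindingPackage` are hypothetical. The `tokenizers-<platform>-<arch>-<libc>` pattern is inferred from the error message, and the `gnu` suffix for glibc is an assumption. It also assumes the runtime exposes `process.report` the way Node does.

```javascript
// Hypothetical helpers, not part of the tokenizers package.

// Assumption: on glibc systems the diagnostic report exposes
// header.glibcVersionRuntime; on musl (Alpine) that field is absent.
// Returns null when no report is available, i.e. "cannot tell".
function looksLikeMusl(report) {
  if (!report || !report.header) return null;
  return !report.header.glibcVersionRuntime;
}

// Naming pattern inferred from the error message above.
// Only Linux is handled in this sketch; other platforms return null.
function expectedBindingPackage(platform, arch, musl) {
  if (platform !== "linux") return null;
  return `tokenizers-${platform}-${arch}-${musl ? "musl" : "gnu"}`;
}

// Use the diagnostic report only if the runtime provides one.
const report =
  typeof process.report === "object" &&
  process.report !== null &&
  typeof process.report.getReport === "function"
    ? process.report.getReport()
    : null;

const musl = looksLikeMusl(report);
console.log("musl detected:", musl);
console.log(
  "expected binding:",
  expectedBindingPackage(process.platform, process.arch, musl === true)
);
```

If the printed name has no matching directory under `node_modules`, the `require` in the loader would fail in the way shown above.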

tokenizers.js:

import { Tokenizer } from "tokenizers";
const tokenizer = await Tokenizer.fromFile("tokenizer.json");
const wpEncoded = await tokenizer.encode("Who is John?");

Dockerfile:

FROM oven/bun:alpine AS base
WORKDIR /usr/src/app


FROM base AS install
RUN mkdir -p /temp/dev
COPY package.json bun.lockb /temp/dev/
RUN cd /temp/dev && bun install --frozen-lockfile

# install with --production (exclude devDependencies)
RUN mkdir -p /temp/prod
COPY package.json bun.lockb /temp/prod/
RUN cd /temp/prod && bun install --frozen-lockfile --production


# copy node_modules from temp directory
# then copy all (non-ignored) project files into the image
FROM base AS prerelease
COPY --from=install /temp/dev/node_modules node_modules
COPY . .

# copy production dependencies and source code into final image
FROM base AS release
# RUN apk add --no-cache gcompat python3 make gcc g++ glibc-2.35-r1.apk wget
RUN apk add --no-cache \
    gcompat \
    libc6-compat \
    python3 \
    make \
    gcc \
    g++ \
    bash \
    libstdc++ \
    musl-dev \
    wget
    
# RUN wget -q -O /etc/apk/keys/sgerrand.rsa.pub https://alpine-pkgs.sgerrand.com/sgerrand.rsa.pub && \
#     wget -q https://github.com/sgerrand/alpine-pkg-glibc/releases/download/2.35-r0/glibc-2.35-r0.apk && \
#     apk add glibc-2.35-r0.apk && \
#     rm glibc-2.35-r0.apk

COPY --from=install /temp/prod/node_modules node_modules
COPY --from=prerelease /usr/src/app/*.js .
COPY --from=prerelease /usr/src/app/package.json .
# RUN ldd /usr/src/app/node_modules/onnxruntime-node/bin/napi-v3/linux/x64/libonnxruntime.so.1.14.0
# run the app
USER bun
# EXPOSE 3000/tcp
ENTRYPOINT [ "bun", "run", "tokenizers.js" ]

Narsil (Collaborator) commented Jan 9, 2025

What version of tokenizers are you referring to? We haven't uploaded tokenizers.js to npm in a long while (we did rewrite everything with napi, but frankly it seems the work to maintain the JS branch wasn't worth it).

Cheers.

PylotLight (Author) commented

Ah right, I see, you have a point.
It does seem like I was testing with the old "tokenizers": "^0.13.3"
(see https://github.com/huggingface/tokenizers/tree/main/bindings/node).
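
Since `^0.13.3` is a range rather than an exact version, one way to answer the version question is to read the manifest that was actually installed. The sketch below assumes the package declares its platform bindings under `optionalDependencies`, as napi-rs packages commonly do. The helper `summarizeManifest` is hypothetical and the sample manifest is made up for illustration; it is not the real package contents.

```javascript
// Hypothetical helper: summarize an installed package manifest and list
// optional dependencies that look like platform bindings ("<name>-...").
function summarizeManifest(manifest) {
  const optional = Object.keys(manifest.optionalDependencies ?? {});
  return {
    name: manifest.name,
    version: manifest.version,
    bindings: optional.filter((dep) => dep.startsWith(`${manifest.name}-`)),
  };
}

// Made-up manifest for illustration only.
const sampleManifest = {
  name: "tokenizers",
  version: "0.13.3",
  optionalDependencies: {
    "tokenizers-linux-x64-gnu": "0.13.3",
    "tokenizers-linux-x64-musl": "0.13.3",
    "some-other-dep": "1.0.0",
  },
};

console.log(summarizeManifest(sampleManifest));
```

Inside the container, the real manifest could be loaded with `JSON.parse(readFileSync("/usr/src/app/node_modules/tokenizers/package.json", "utf8"))` from `node:fs` and passed to the same function. If the musl binding is listed but its directory is missing from `node_modules`, the install step is the place to look.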

I might have to revisit the issue, as I still haven't got an onnx/tokenizer JS lib working on Alpine outside of using bun to compile a CLI binary. So please do let me know if there's an embedding service that works on Alpine at all.

But given this issue was filed against an older version, I may have no choice but to close it for now.
