please how to call it locally #185

Open
NanshaNansha opened this issue Jul 4, 2024 · 2 comments

Comments

@NanshaNansha

[screenshot attachment; content not recoverable]

@Siddharth-Latthe-07

@NanshaNansha To load a model locally using the PeftModel class, you need to ensure that the base model and the adapter files are available locally.
Try these steps and let me know if they work:

  1. Install the dependencies: pip install transformers peft
  2. Prepare the local paths: set the paths where your pretrained base model, the adapter, and the cache directory are located.

Example snippet:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Define the local paths to the base model, the adapter, and the cache directory
base_model_path = 'path/to/your/base_model'
adapter_path = 'path/to/your/local_adapter_directory'
cache_dir = 'path/to/your/cache_directory'

# Load the base model first, then attach the PEFT adapter on top of it
base_model = AutoModelForCausalLM.from_pretrained(base_model_path, cache_dir=cache_dir)
model = PeftModel.from_pretrained(base_model, adapter_path, cache_dir=cache_dir)

# Example: use the model for inference
# Make sure you also load the tokenizer of the base model
tokenizer = AutoTokenizer.from_pretrained(base_model_path)
input_text = "Your input text here"

inputs = tokenizer(input_text, return_tensors="pt")
outputs = model(**inputs)

print(outputs)
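
If you want generated text (e.g., a forecast) rather than a raw forward pass, you would typically call generate and decode the result. A minimal sketch, assuming the model and tokenizer loaded above; the generation settings are illustrative, not from the repo:

# Greedy generation from the PEFT-wrapped model, then decode to a string
gen_outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(gen_outputs[0], skip_special_tokens=True))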

Let me know if it works.
Thanks

@allenlsy

allenlsy commented Jan 2, 2025

I did get most of the demo working on my M4 MacBook Pro. You will need to change some code and install a few packages, but it's doable.

The parts that work are:

  1. Prepare the training dataset from Finnhub and convert it to the Llama format.
  2. Use the adapter in the repo on top of Llama as the model to forecast a given stock.

The part that is not working for me: after step 1 above, I actually need to fine-tune Llama to generate my own adapter, since the adapter in this repo is around a year old. The training step seems to require CUDA, which I don't have.

I also tried to run it in the cloud on an Nvidia card, with around 1000 rows of training data and 200 rows of test data. The training works there too, but it's super slow: around 13 hours.

For the training process, I ran train.sh on a machine with a 16 GB V100 card, and it says the memory is not big enough. It looks like PyTorch already takes most of the GPU memory, while the allocation that fails needs less than 200 MB.

[rank0]: torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 172.00 MiB. 
GPU 0 has a total capacity of 15.77 GiB of which 16.19 MiB is free. Including non-PyTorch memory, 
this process has 15.75 GiB memory in use. Of the allocated memory 15.45 GiB is allocated by PyTorch, 
and 1.74 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting 
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  
See documentation for Memory Management 
 (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

@BruceYanghy is this expected?
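
One common way to fit Llama fine-tuning on a 16 GB card is to load the base model in 4-bit with bitsandbytes before attaching the LoRA adapter (QLoRA-style). A minimal sketch, assuming bitsandbytes and accelerate are installed; the path is a placeholder and this is not taken from the repo's train.sh:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Hypothetical local path to the Llama base model (replace with your own)
base_model_path = 'path/to/your/base_model'

# 4-bit quantization config; requires the bitsandbytes and accelerate packages
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Load the base model in 4-bit before attaching the LoRA adapter for training
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_path,
    quantization_config=bnb_config,
    device_map="auto",
)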
