please how to call it locally #185

Open
NanshaNansha opened this issue Jul 4, 2024 · 2 comments

Comments

@NanshaNansha

[screenshot attachment; content not recoverable]

@Siddharth-Latthe-07

@NanshaNansha To load a model locally using the PeftModel class, you need to ensure that the base model and the adapter files are available locally.
Try these steps and let me know if they work:

  1. Install the dependencies: pip install transformers peft
  2. Prepare the local paths: set the paths where your pretrained base model, the adapter, and the cache directory are located.

Example snippet:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Define the local paths to the base model, the adapter, and the cache directory
base_model_path = 'path/to/your/base_model'
adapter_path = 'path/to/your/local_adapter_directory'
cache_dir = 'path/to/your/cache_directory'

# Load the base model first, then attach the PEFT adapter on top of it
base_model = AutoModelForCausalLM.from_pretrained(base_model_path, cache_dir=cache_dir)
model = PeftModel.from_pretrained(base_model, adapter_path, cache_dir=cache_dir)

# Example: use the model for inference
# Make sure you also load the tokenizer of the base model
tokenizer = AutoTokenizer.from_pretrained(base_model_path)
input_text = "Your input text here"

inputs = tokenizer(input_text, return_tensors="pt")
outputs = model(**inputs)

print(outputs)
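
If you want generated text (e.g., a forecast) rather than a raw forward pass, you would typically call generate and decode the result. A minimal sketch, assuming the model and tokenizer loaded above; the generation settings are illustrative, not from the repo:

# Greedy generation from the PEFT-wrapped model, then decode to a string
gen_outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(gen_outputs[0], skip_special_tokens=True))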

Let me know if it works.
Thanks

@allenlsy

allenlsy commented Jan 2, 2025

I did get most of the demo working on my M4 MacBook Pro. You will need to change some code and install a few packages, but it's doable.

The parts that work are:

  1. Prepare the training dataset from Finnhub and convert it to the Llama format.
  2. Use the adapter in the repo on top of Llama as the model to forecast a given stock.

The part that is not working for me: after step 1 above, I actually need to fine-tune Llama to generate my own adapter, since the adapter in this repo is around a year old. The training step seems to require CUDA, which I don't have.

I also tried to run it in the cloud on an Nvidia card, with around 1000 rows of training data and 200 rows of test data. The training works there too, but it's super slow: around 13 hours.

For the training process, I ran train.sh on a machine with a 16 GB V100 card, and it says the memory is not big enough. It looks like PyTorch already takes most of the GPU memory, while the allocation that fails needs less than 200 MB.

[rank0]: torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 172.00 MiB. 
GPU 0 has a total capacity of 15.77 GiB of which 16.19 MiB is free. Including non-PyTorch memory, 
this process has 15.75 GiB memory in use. Of the allocated memory 15.45 GiB is allocated by PyTorch, 
and 1.74 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting 
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  
See documentation for Memory Management 
 (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

@BruceYanghy is this expected?
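
One common way to fit Llama fine-tuning on a 16 GB card is to load the base model in 4-bit with bitsandbytes before attaching the LoRA adapter (QLoRA-style). A minimal sketch, assuming bitsandbytes and accelerate are installed; the path is a placeholder and this is not taken from the repo's train.sh:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Hypothetical local path to the Llama base model (replace with your own)
base_model_path = 'path/to/your/base_model'

# 4-bit quantization config; requires the bitsandbytes and accelerate packages
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Load the base model in 4-bit before attaching the LoRA adapter for training
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_path,
    quantization_config=bnb_config,
    device_map="auto",
)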
