The code example is here:

from torchgeo.datasets import stack_samples
from torch.utils.data import DataLoader
from torchgeo.samplers import RandomGeoSampler, RandomBatchGeoSampler
from torchgeo.datasets import ChesapeakeCVPR

root = '/scratch/local/cvpr_chesapeake_landcover/'
dataset_train = ChesapeakeCVPR(root, splits=['md-train'], layers=['naip-new', 'lc', 'nlcd'], download=False)
sampler_train = RandomGeoSampler(dataset_train, size=224, length=1)
loader_train = DataLoader(dataset_train, sampler=sampler_train, collate_fn=stack_samples)

# Print the shapes of the first batch only
for i, data in enumerate(loader_train):
    images = data["image"]
    targets = data["mask"]
    print(images.shape, targets.shape)
    break

Then it shows the shape of images and targets:
which does not match the size of 224 that I specified in the code.
This is a result of the way
but has the following disadvantages as a result:
Personally, I vote we convert it to
As a workaround, you can look at the
Hope this helps! Want to convert this to an issue so I can assign @calebrob6? There should be a button on the right.
Hey @hfangcat, this issue is related to #278 and #409. The problem is that the ChesapeakeCVPR dataset is made up of large tiles from several different CRSs. The approach of RasterDataset is to choose a single CRS, reproject all data into that CRS, and have the samplers produce bboxes in that CRS, resampling on-the-fly to ensure that everything is pixel-aligned. This is good for cases where your data is not pixel-aligned beforehand, but produces unnecessary (and significant) slowdowns when your data is already pixel-aligned. As ChesapeakeCVPR data is already pixel-aligned, the compromise I take is to resample the bboxes to each tile's local CRS (which is fast) and then mask from each layer (also fast). So, even if you ask for a 224x224 meter bbox in EPSG:3857, this translates into variable-sized bboxes in the CRSs of the tiles in the dataset (and the size will vary with latitude).
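To make the latitude dependence concrete, here is a rough sketch (not TorchGeo code) of how far a 224-unit side in EPSG:3857 reaches on the ground. It assumes the usual spherical approximation in which Web Mercator stretches distances by about 1/cos(latitude); the 1 m/px resolution and the example latitudes are assumptions for illustration, and the helper name `approx_ground_size_m` is made up.

```python
import math


def approx_ground_size_m(size_3857_m: float, latitude_deg: float) -> float:
    """Approximate ground length of a distance measured in EPSG:3857 units.

    Uses the spherical Mercator scale factor 1 / cos(latitude), so this is
    an estimate, not an exact reprojection.
    """
    return size_3857_m * math.cos(math.radians(latitude_deg))


# Example latitudes are illustrative, not taken from the dataset metadata.
for lat in (37.0, 39.0, 42.0):
    ground = approx_ground_size_m(224, lat)
    # At an assumed 1 m/px, the ground length is also the approximate pixel count.
    print(f"lat {lat:.0f}: ~{ground:.1f} m, roughly {round(ground)} px at 1 m/px")
```

Under these assumptions the same 224-unit request yields noticeably fewer than 224 pixels, and a different count at each latitude, which is consistent with the mismatched shapes reported above.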
Hi all @calebrob6 @adamjstewart, I solved the problem by looking at the source code of

# Assumes `import kornia.augmentation as K`; `_Transform` is a wrapper class that is not shown in this post.
# Sample patches larger than needed (224*3), then center-crop every sample to 224x224.
dataset_train = ChesapeakeCVPR(root, splits=['md-train'], layers=['naip-new', 'lc'],
                               transforms=_Transform(K.CenterCrop(224)))
dataset_val = ChesapeakeCVPR(root, splits=['wv-val'], layers=['naip-new', 'lc'],
                             transforms=_Transform(K.CenterCrop(224)))
train_batch_sampler = RandomBatchGeoSampler(dataset_train, size=224*3, batch_size=64, length=1000)
val_batch_sampler = RandomBatchGeoSampler(dataset_val, size=224*3, batch_size=64, length=100)
loader_train = DataLoader(dataset_train, batch_sampler=train_batch_sampler, collate_fn=stack_samples, num_workers=0)
loader_val = DataLoader(dataset_val, batch_sampler=val_batch_sampler, collate_fn=stack_samples, num_workers=0)

Although the code is working now, I have two questions regarding the length and the crop function:
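For readers wondering why oversampling followed by a center crop gives a fixed shape, here is a minimal, library-free sketch of the index arithmetic. The helper `center_crop_window` and the patch sizes are hypothetical; this is not how Kornia's `CenterCrop` is implemented internally, only an illustration of the idea.

```python
def center_crop_window(height: int, width: int, size: int) -> tuple[slice, slice]:
    """Return row/column slices selecting a centered size x size window."""
    if height < size or width < size:
        raise ValueError(f"patch {height}x{width} is smaller than crop size {size}")
    top = (height - size) // 2
    left = (width - size) // 2
    return slice(top, top + size), slice(left, left + size)


# Hypothetical patch shapes, standing in for the variable sizes the sampler returns.
for h, w in [(523, 522), (530, 519), (672, 672)]:
    rows, cols = center_crop_window(h, w, 224)
    cropped = (rows.stop - rows.start, cols.stop - cols.start)
    print((h, w), "->", cropped)
```

As long as every sampled patch is at least 224 pixels on each side, the cropped window is always 224x224, so the samples can be stacked into a batch. With a tensor of shape `(C, H, W)`, the slices would be applied as `image[:, rows, cols]`.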