3DGS itself supports training on Blender datasets, with the main data format being:
<location>
|---images
| |---<image 0>
| |---<image 1>
| |---...
|---transformers_train.json
|---point3d.ply
Data Preparation#
Typically, the datasets we collect ourselves come from laser scanning devices, providing las format point cloud files, json format camera poses, and images.
First, process the las format point cloud files, converting them into binary ply files.
Then, check the transformers_train.json file, mainly to specifically modify the 3dgs code file located at /sence/dataset_readers.py.
def readNerfSyntheticInfo(path, white_background, eval, extension=".jpg"):
print("Reading Training Transforms")
train_cam_infos = readCamerasFromTransforms(path, "transforms_train.json", white_background, extension)
#print("Reading Test Transforms")
#test_cam_infos = readCamerasFromTransforms(path, "transforms_test.json", white_background, extension)
#if not eval:
# train_cam_infos.extend(test_cam_infos)
# test_cam_infos = []
test_cam_infos = []
nerf_normalization = getNerfppNorm(train_cam_infos)
ply_path = os.path.join(path, "points3d.ply")
if not os.path.exists(ply_path):
# Since this data set has no colmap data, we start with random points
num_pts = 100_000
print(f"Generating random point cloud ({num_pts})...")
# We create random points inside the bounds of the synthetic Blender scenes
xyz = np.random.random((num_pts, 3)) * 2.6 - 1.3
shs = np.random.random((num_pts, 3)) / 255.0
pcd = BasicPointCloud(points=xyz, colors=SH2RGB(shs), normals=np.zeros((num_pts, 3)))
storePly(ply_path, xyz, SH2RGB(shs) * 255)
try:
pcd = fetchPly(ply_path)
except:
pcd = None
scene_info = SceneInfo(point_cloud=pcd,
train_cameras=train_cam_infos,
test_cameras=test_cam_infos,
nerf_normalization=nerf_normalization,
ply_path=ply_path)
return scene_info
The Test Transforms related parts are commented out, and since eval will not be performed, test is not needed. The extension has been modified, mainly to check the input format of your images.
def readCamerasFromTransforms(path, transformsfile, white_background, extension=".jpg"):
cam_infos = []
with open(os.path.join(path, transformsfile)) as json_file:
contents = json.load(json_file)
fovx = contents["camera_angle_x"]
frames = contents["frames"]
for idx, frame in enumerate(frames):
cam_name = os.path.join(path,"images", frame["file_path"])
# NeRF 'transform_matrix' is a camera-to-world transform
c2w = np.array(frame["transform_matrix"])
# change from OpenGL/Blender camera axes (Y up, Z back) to COLMAP (Y down, Z forward)
c2w[:3, 1:3] *= -1
# get the world-to-camera transform and set R, T
w2c = np.linalg.inv(c2w)
R = np.transpose(w2c[:3,:3]) # R is stored transposed due to 'glm' in CUDA code
T = w2c[:3, 3]
image_path = os.path.join(path, cam_name)
image_name = Path(cam_name).stem
image = Image.open(image_path)
im_data = np.array(image.convert("RGBA"))
bg = np.array([1,1,1]) if white_background else np.array([0, 0, 0])
norm_data = im_data / 255.0
arr = norm_data[:,:,:3] * norm_data[:, :, 3:4] + bg * (1 - norm_data[:, :, 3:4])
image = Image.fromarray(np.array(arr*255.0, dtype=np.byte), "RGB")
fovy = focal2fov(fov2focal(fovx, image.size[0]), image.size[1])
FovY = fovy
FovX = fovx
cam_infos.append(CameraInfo(uid=idx, R=R, T=T, FovY=FovY, FovX=FovX, image=image,
image_path=image_path, image_name=image_name, width=image.size[0], height=image.size[1]))
return cam_infos
Here, changes are usually made based on the content of your json file, with a few key points to note:
fovx = contents["camera_angle_x"]
cam_name = os.path.join(path,"images", frame["file_path"])
def fetchPly(path):
plydata = PlyData.read(path)
vertices = plydata['vertex']
positions = np.vstack([vertices['x'], vertices['y'], vertices['z']]).T
colors = np.vstack([vertices['red'], vertices['green'], vertices['blue']]).T / 255.0
normals = np.vstack([0,0,0]).T
return BasicPointCloud(points=positions, colors=colors, normals=normals)
This part is related to the ply file, and typically self-collected datasets do not have point cloud normals, so we set them to 0 based on the colmap part of the code.
Generally, after making these changes, the code should run smoothly, allowing laser point clouds combined with GPS high-precision camera poses to solve the problem of colmap being unable to obtain camera poses.