Drone localization is essential for purposes such as navigation, autonomous flight, and object tracking. However, the task is challenging when satellite signals are unavailable. This paper addresses vision-only localization of flying drones through optimal window velocity fusion. Multiple optimal windows are derived from a piecewise linear regression (segment) model of the image-to-real-world conversion function. Each window serves as a template for estimating the drone's instantaneous velocity. The velocities obtained from the multiple optimal windows are integrated by two fusion rules: a weighted average for lateral velocity and a winner-take-all decision for longitudinal velocity. In the experiments, a drone performed a total of six short-range (about 800 m to 2 km), high-maneuvering flights in rural and urban areas. The four rural flights consist of a forward-backward straight flight, a forward-backward zigzag flight (a snake path), a square path with three banked turns, and a free flight that includes both banked turns and zigzags. The two urban flights are a straight outbound flight and a forward-backward straight flight. Performance was evaluated through the root mean squared error (RMSE) and drift error between the ground-truth trajectory and the rigid-body-rotated vision-only trajectory. The proposed image-based method achieves flight errors of a few meters to tens of meters, corresponding to around 3% of the flight length.
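The two fusion rules described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the per-window `weights` for the lateral rule and the per-window confidence `scores` for the longitudinal rule are hypothetical placeholders (the paper derives them from its optimal-window model).

```python
import numpy as np

def fuse_velocities(v_lat, v_lon, weights, scores):
    """Fuse per-window velocity estimates with the two rules.

    v_lat, v_lon : lateral/longitudinal velocity estimates, one per window
    weights      : lateral fusion weights (hypothetical placeholder)
    scores       : longitudinal confidence scores (hypothetical placeholder)
    """
    v_lat = np.asarray(v_lat, dtype=float)
    v_lon = np.asarray(v_lon, dtype=float)
    w = np.asarray(weights, dtype=float)
    # Rule 1: weighted average for lateral velocity.
    lat = float(np.sum(w * v_lat) / np.sum(w))
    # Rule 2: winner-take-all for longitudinal velocity —
    # keep only the estimate from the highest-scoring window.
    lon = float(v_lon[int(np.argmax(scores))])
    return lat, lon
```

The split reflects the abstract's design choice: lateral estimates are blended, while the longitudinal estimate comes from a single winning window.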