Neighbourhood Consensus Networks · Neighbourhood Consensus Networks Introduction Proposed method...
Transcript of Neighbourhood Consensus Networks · Neighbourhood Consensus Networks Introduction Proposed method...
Neighbourhood Consensus Networks
Introduction Proposed method
Ignacio Rocco1,2 Josef Sivic1,2,3Mircea Cimpoi3 Akihiko Torii5 Tomas Pajdla3
1DI ENS, PSL Research University 3CIIRC, CTU in Prague 4DeepMind2INRIA 5Tokyo Institute of Technology
Feature extraction and matching Trainable match filtering
4-D conv. layers
4D space of raw match scores
Neighbourhood Consensus Network
4D filteredmatch scores
Soft mutual nearest-neighbours filtering Neighbourhood Consensus Network
is a 4D CNN
The filters of the first layer span
Task: find pixel-level visual correspondences
where
Extracting correspondences:
Experimental results
Contributions: - Trainable method for feature extraction, matching and filtering
- Based on dense CNN descriptors and a novel 4D neighbourhood consensus CNN
- Trainable without correspondence annotation
Challenges: strong illumination or appearance changes
day-night
matching
changes
across time
repetitive
structures
intra-class
variation
ratio-to-maxalong image A
ratio-to-maxalong image B
Very few correct matches:
Correct matches: Total matches:
Matches can be extracted in both directions from the output :
Training loss:
The network is trained with weak supervision:
The network is applied twice for invariance to pair order:
Normalized scores over :matching Normalized scores over :
Category-level matching: PF-Pascal dataset [1]
- Task: match similar semantic parts - Metric: percentage of correct keypoints (PCK)
Method PCK
HOG+PF-LOM [1] 62.5SCNet-AG+ [2] 72.2CNNGeo [3] 71.9WeakAlign [4] 75.8NC-Net 78.9
References
[1] B. Ham, M. Cho, C. Schmid, and J. Ponce. Proposal flow: Semantic correspondences from object proposals. IEEE PAMI, 2017.[2] K. Han, R. S. Rezende, B. Ham, K.-Y. K. Wong, M. Cho, C. Schmid, and J. Ponce. SCNet: Learning Semantic Correspondence. In Proc. ICCV, 2017.[3] I. Rocco, R. Arandjelović, and J. Sivic. Convolutional neural network architecture for geometric matching. In Proc. CVPR, 2017.[4] I. Rocco, R. Arandjelović, and J. Sivic. End-to-end weakly-supervised semantic alignment. In Proc. CVPR, 2018.[5] H. Taira, M. Okutomi, T. Sattler, M. Cimpoi, M. Pollefeys, J. Sivic, T. Pajdla, and A. Torii. InLoc: Indoor visual localization with dense matching and view synthesis. In
Proc. CVPR, 2018.
- Handcrafted sparse keypoints
- Trainable dense features - VGG/ResNet (e.g. conv4)
dense descriptors
sparsedescriptors
- Ratio test (1st NN/2nd NN)
- Nearest neighbours
- Mutual nearest-neighbours
- Global methods
- Semi-local methods
homography fitting using RANSAC
Neighbourhood consensus
match being analyzedneighbouring matches
x
Tentativematching
Matchfiltering
Inputimages
Featureextraction
x
Review - classical pipeline:
matching
Instance-level matching: InLoc dataset [5]
- Task: retrieve 6-dof camera poses of query images - Metric: percentage of correctly localized queries
0 0.25 0.5 0.75 1 1.25 1.5 1.75 20
20
40
60
80
Distance threshold [meters]
Corr
ectly
loca
lized
quer
ies
[%]
InLoc + NCNetInLoc + MNNInLoc [5]DensePE + NCNetDensePE + MNNDensePE [5]SparsePE [5]
Ours Ground-truth Ours Ground-truth
0.36m, 1.93◦ 8.11m, 48.38◦
0.41m, 0.83◦ 6.65m, 23.86◦
0.51m, 0.99◦ 9.36m, 23.72◦
Ours (InLoc+NCNet) Baseline (InLoc)
- positive pairs ( ): maximize match score - negative pairs ( ): minimize match score
score distributions
tend to Kronecker delta
score distributionstend to uniform
positive matches:
negative matches:
mean score of matches
mean score of matches