@EtienneT
One thing to consider is the setting of your problem: whether it is generative or discriminative. If you are learning the distribution of the data itself, then technically there is no need for a validation set, since holding one out only means partitioning the data you were given into train and validation sets. You could instead use all of the samples to learn the distribution, and then simply generate test predictions. In that setting you are learning the distribution of your X:
Pr(X)
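As a rough sketch of the generative case (assuming, purely for illustration, that X is one-dimensional and roughly Gaussian; the data below is synthetic and the variable names are made up), you could fit on every sample and then draw from the fitted distribution:

```python
import random
from statistics import NormalDist

rng = random.Random(0)

# Synthetic stand-in for the observed X (hypothetical data, not from the thread).
x = [rng.gauss(5.0, 2.0) for _ in range(1000)]

# Estimate Pr(X) from all samples; nothing is held out.
dist = NormalDist.from_samples(x)

# Generate new samples from the learned distribution.
generated = dist.samples(5, seed=1)
```

A real generative model would be far richer than a single Gaussian, but the data flow is the same: all samples go into the fit.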
However, in a discriminative (classification) or regression setting, it would behoove you to have a validation set, since there you are learning:
Pr(Y|X)
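For the discriminative case, here is a minimal sketch of holding out a validation set. Everything in it is an assumption for illustration: the synthetic two-class data, the 80/20 split, and the toy threshold "model" standing in for whatever classifier you actually train.

```python
import random

rng = random.Random(0)

# Synthetic 1-D data (hypothetical): class 0 centred at 0, class 1 centred at 3.
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(500)]
data += [(rng.gauss(3.0, 1.0), 1) for _ in range(500)]
rng.shuffle(data)

# Hold out 20% of the labelled data for validation.
split = int(0.8 * len(data))
train, val = data[:split], data[split:]

def class_mean(rows, label):
    xs = [x for x, y in rows if y == label]
    return sum(xs) / len(xs)

# "Training": put the decision threshold midway between the class means.
threshold = (class_mean(train, 0) + class_mean(train, 1)) / 2

def predict(x):
    return 1 if x > threshold else 0

def accuracy(rows):
    return sum(predict(x) == y for x, y in rows) / len(rows)

train_acc = accuracy(train)
val_acc = accuracy(val)  # estimate of performance on unseen data
```

Comparing `train_acc` with `val_acc` is what tells you whether the model generalizes, which is the reason to keep the held-out split in this setting.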