The paper addresses the formation and structure of mental images, which are specific to each naive linguistic world view. The analysis is based on a series of experiments that study the mental images of "zayats" and "krolik" in the Russian linguistic world view and of "hare" and "rabbit" in the English one. The participants were 22 Russian-speaking adults and 17 English-speaking adults. The same experiment was performed with children aged 4-7, and the analysis of these data has shown how a mental image is formed. The analysis has shown that the understanding and structure of mental images, as well as their recognition, depend on cultural and linguistic information, especially on the personal cultural and linguistic experience of a speaker, while the core of any mental image consists of encyclopedic knowledge, which is stable and basic. The acquisition of two closely related mental images proceeds through several stages: children start differentiating them on the basis of their own sensory experience at the age of 5, whereas the cultural and linguistic information of a mental image is acquired after the age of 6. The comparative analysis of the mental images of "zayats" and "krolik" in Russian and of "hare" and "rabbit" in English has shown that the mental image of "rabbit" corresponds to both "zayats" and "krolik" in Russian.