The image (shadow) being inverted depends on there being an aperture at the cross-over and the image (shadow) being distant. The explanation lies in the aperture.
The image (shadow): The light reaches the person shining like an arrow. The lowest [light] that reaches the person is the highest [in the image] and the highest [light] that reaches the person is the lowest [in the image]. The feet conceal the lowest light and therefore become the image (shadow) at the top. The head conceals the highest light and therefore becomes the image (shadow) at the bottom.
Book 10: Exposition of Canon II; this is the earliest known description of the inverted image produced by a camera obscura; as translated by Ian Johnston in The Mozi (2010), p. 489