Systems, robots, and methods for generating three-dimensional skeleton representations of people are disclosed. A method includes generating, from a two-dimensional image, a two-dimensional skeleton representation of a person present in the two-dimensional image. The two-dimensional skeleton representation includes a plurality of joints and a plurality of links between individual joints of the plurality of joints. The method further includes positioning a cone around one or more links of the plurality of links, and identifying points of a depth cloud that intersect with the cone positioned around the one or more links of the two-dimensional skeleton representation. The points of the depth cloud are generated by a depth sensor, and each point provides depth information. The method also includes projecting the two-dimensional skeleton representation into three-dimensional space using the depth information of the points that intersect with the cone, thereby generating the three-dimensional skeleton representation of the person.
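
The steps recited above can be illustrated with a minimal sketch, which is not the disclosed implementation. It assumes several details the abstract does not specify: the cone is modeled in the image plane as a region around each link whose radius tapers linearly from one joint to the other; depth cloud points are given as pixel coordinates with a depth value; the depth of each joint is taken as the median depth of the points inside the cones of its links; and the projection into three-dimensional space uses a pinhole camera model. The names `points_in_cone` and `lift_skeleton`, the radii `r_a` and `r_b`, and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical and chosen only for illustration.

```python
from statistics import median


def points_in_cone(a, b, r_a, r_b, cloud):
    """Return depth cloud points (u, v, z) inside a tapered region around link a-b.

    Assumption: the cone is modeled in the image plane, with a radius that
    varies linearly from r_a at joint a to r_b at joint b.
    """
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    hits = []
    for u, v, z in cloud:
        # Parameter t in [0, 1] of the closest point on the link segment.
        t = 0.0 if length_sq == 0 else ((u - ax) * dx + (v - ay) * dy) / length_sq
        t = max(0.0, min(1.0, t))
        px, py = ax + t * dx, ay + t * dy
        radius = r_a + t * (r_b - r_a)
        if (u - px) ** 2 + (v - py) ** 2 <= radius ** 2:
            hits.append((u, v, z))
    return hits


def lift_skeleton(joints, links, cloud, fx, fy, cx, cy, r_a=10.0, r_b=5.0):
    """Project 2D joints into 3D using depths of points inside the link cones.

    Assumptions: joint depth is the median depth of the intersecting points,
    and back-projection follows a pinhole model with intrinsics fx, fy, cx, cy.
    """
    depths = {name: [] for name in joints}
    for j0, j1 in links:
        hits = points_in_cone(joints[j0], joints[j1], r_a, r_b, cloud)
        zs = [z for _, _, z in hits]
        depths[j0].extend(zs)
        depths[j1].extend(zs)

    skeleton_3d = {}
    for name, (u, v) in joints.items():
        if not depths[name]:
            continue  # no intersecting depth points, so the joint is left out
        z = median(depths[name])
        skeleton_3d[name] = ((u - cx) * z / fx, (v - cy) * z / fy, z)
    return skeleton_3d


# Hypothetical example: one link, three nearby depth points, one far outlier.
joints = {"shoulder": (100.0, 100.0), "elbow": (100.0, 160.0)}
links = [("shoulder", "elbow")]
cloud = [
    (100.0, 110.0, 2.0),
    (102.0, 130.0, 2.1),
    (99.0, 150.0, 1.9),
    (300.0, 300.0, 5.0),  # far from the link, so it should be ignored
]
skeleton = lift_skeleton(joints, links, cloud, fx=500.0, fy=500.0, cx=100.0, cy=100.0)
print(skeleton)
```

In this sketch the outlier at pixel (300, 300) falls outside the tapered region and does not influence the result, which reflects the purpose of restricting the depth lookup to points that intersect the cone. Using the median rather than the mean is a design choice of the sketch, intended to reduce sensitivity to stray background points.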