The only way to get pixel-perfect collision is to build a fairly complex system on top of the sprites; polygons are the easier and better-performing option.
To do that, you would need to read every single pixel of one image's data, translate it to a world position, and compare it against the position of every single pixel of the other image.
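To give an idea of what that per-pixel check involves, here is a minimal, engine-independent sketch in Python. It assumes each image has already been reduced to a grid of 0/1 opacity flags and that each sprite has an integer top-left world position, with no rotation or scaling; `pixels_overlap` and its parameters are names made up for this illustration, not part of any engine API.

```python
def pixels_overlap(mask_a, pos_a, mask_b, pos_b):
    """Return True if an opaque pixel of A lands on an opaque pixel of B.

    mask_a / mask_b: lists of rows, each row a list of 0/1 opacity flags.
    pos_a / pos_b: (x, y) world position of each image's top-left pixel.
    """
    # Translate every opaque pixel of B to its world position.
    solid_b = {
        (pos_b[0] + x, pos_b[1] + y)
        for y, row in enumerate(mask_b)
        for x, opaque in enumerate(row)
        if opaque
    }
    # Translate every opaque pixel of A and look it up among B's pixels.
    for y, row in enumerate(mask_a):
        for x, opaque in enumerate(row):
            if opaque and (pos_a[0] + x, pos_a[1] + y) in solid_b:
                return True
    return False


# Two 2x2 images, each with a single opaque pixel.
mask_a = [[1, 0],
          [0, 0]]
mask_b = [[0, 0],
          [0, 1]]

same_spot = pixels_overlap(mask_a, (0, 0), mask_b, (-1, -1))
boxes_only = pixels_overlap(mask_a, (0, 0), mask_b, (0, 0))
```

Even this simplified version touches every pixel of both images on every check, and it would have to be redone for each animation frame, which is where the cost comes from.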
Usually this is not needed in games (and even less with animated things). It does not offer anything to the players, and perfection can be annoying to them too, on top of the poor performance of that kind of check. It is generally better to use rough polygons or simple boxes, always favoring the gameplay, and work with that.
Bodies and areas can have many shapes, and some of those shapes can be tied to the animations. The finer shape-by-shape comparison then only needs to run when the larger bounding boxes of the two elements overlap.
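The two-step idea (cheap big-box test first, finer shapes only on overlap) can be sketched as follows. This is plain Python rather than engine code, and it assumes every shape is an axis-aligned box given as `(x, y, width, height)` in world coordinates; `boxes_overlap`, `bounding_box`, and `collides` are hypothetical helper names used only for this example.

```python
def boxes_overlap(a, b):
    """Axis-aligned overlap test for (x, y, width, height) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def bounding_box(shapes):
    """Smallest box that encloses all of an element's shapes."""
    left = min(x for x, y, w, h in shapes)
    top = min(y for x, y, w, h in shapes)
    right = max(x + w for x, y, w, h in shapes)
    bottom = max(y + h for x, y, w, h in shapes)
    return (left, top, right - left, bottom - top)


def collides(shapes_a, shapes_b):
    """Compare individual shapes only when the big boxes overlap."""
    # Cheap check first: the larger boxes of the two elements.
    if not boxes_overlap(bounding_box(shapes_a), bounding_box(shapes_b)):
        return False
    # Finer check: any shape of A against any shape of B.
    return any(boxes_overlap(sa, sb) for sa in shapes_a for sb in shapes_b)


# An L-shaped element built from two boxes (e.g. one animation pose).
l_shape = [(0, 0, 1, 3), (0, 2, 3, 1)]

in_empty_corner = collides(l_shape, [(2, 0, 1, 1)])
touching_foot = collides(l_shape, [(2, 2, 1, 1)])
far_away = collides(l_shape, [(10, 10, 1, 1)])
```

The first case falls inside the L's big box but misses both of its shapes, so it is rejected by the finer check; the far-away case never gets past the big-box test at all.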