I also thought this, but learned that what I needed was an Alan Kay style "change of perspective".
The key for me was understanding the "wish" and "claim" concepts in Realtalk. In your example above, you would need to separate your "Dog" program into two programs: one that simply claims "I am a dog" and another that encodes dog behavior, something like "I wish that a bark sound would play when a dog is near a house". Then you'd leave it up to Realtalk to make that happen. Adding "Cat" behavior would mean adding two new programs: one to claim "I am a cat" and another to say "I wish that a meow sound would play when a cat is near a house". To make the "Dog" and "Cat" interact, you'd add a program that said "I wish that a growl sound would play when a dog is near a cat", and so on.
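If it helps, here is a rough sketch of that split in Python. This is not Realtalk syntax or its API; the `Table` class and its `claim`, `wish_sound_when_near`, `place_near` and `sounds` methods are names I made up to model the idea that identity claims and behavior wishes live in separate programs and something else resolves them.

```python
from collections import defaultdict


class Table:
    """Toy stand-in for the shared space. Hypothetical, not Realtalk's API."""

    def __init__(self):
        self.kinds = defaultdict(set)  # object name -> kinds it has claimed
        self.near = set()              # unordered pairs of nearby objects
        self.wishes = []               # (kind_a, kind_b, sound) rules

    def claim(self, obj, kind):
        # An identity program: "I am a dog".
        self.kinds[obj].add(kind)

    def wish_sound_when_near(self, kind_a, kind_b, sound):
        # A behavior program: "I wish <sound> would play when a
        # <kind_a> is near a <kind_b>".
        self.wishes.append((kind_a, kind_b, sound))

    def place_near(self, a, b):
        # Models two distinct physical objects sitting next to each other.
        self.near.add(frozenset((a, b)))

    def sounds(self):
        # The system, not the individual programs, matches wishes to claims.
        playing = set()
        for pair in self.near:
            a, b = sorted(pair)
            kinds_a = self.kinds.get(a, set())
            kinds_b = self.kinds.get(b, set())
            for kind_a, kind_b, sound in self.wishes:
                forward = kind_a in kinds_a and kind_b in kinds_b
                backward = kind_a in kinds_b and kind_b in kinds_a
                if forward or backward:
                    playing.add(sound)
        return sorted(playing)


table = Table()
table.claim("page-1", "dog")
table.claim("page-2", "house")
table.wish_sound_when_near("dog", "house", "bark")
table.place_near("page-1", "page-2")
print(table.sounds())  # ['bark']

# Adding a cat touches none of the dog code: two new programs, plus one
# more for the dog/cat interaction.
table.claim("page-3", "cat")
table.wish_sound_when_near("cat", "house", "meow")
table.wish_sound_when_near("dog", "cat", "growl")
table.place_near("page-3", "page-2")
table.place_near("page-1", "page-3")
print(table.sounds())  # ['bark', 'growl', 'meow']
```

The point of the sketch is that nothing in the "dog" claim knows about barking, houses or cats; each behavior is its own small rule that can be added or removed independently.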
Another example that might help is how I learned this myself: I made a set of playing cards. What I ended up with was 52 pieces of card stock, each with a program that was simply "Claim that I am card X". Then I made separate programs that give those cards meaning. For example, a program to give the cards a style would be something like "I wish that card 1 will have the Ace of Spades printed on it, etc." and another program would be something like "I wish that the sum of all face values of the cards on this line is printed next to the line".
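Roughly, in Python terms (again just an illustration, not Realtalk code): the cards only carry a number, and the style and the sum are separate functions that interpret it. The suit order, the numbering of cards 1 to 52, and counting Ace as 1 and King as 13 are assumptions I picked for the sketch; `style`, `face_value` and `line_sum` are hypothetical names.

```python
# Assumed layout: cards 1-13 are Spades, 14-26 Hearts, and so on.
SUITS = ["Spades", "Hearts", "Diamonds", "Clubs"]
RANKS = ["Ace", "2", "3", "4", "5", "6", "7", "8", "9", "10",
         "Jack", "Queen", "King"]

# Each physical card only makes an identity claim: "I am card X".
card_claims = list(range(1, 53))


def style(card):
    """Style program: decides what gets printed on a given card."""
    rank = RANKS[(card - 1) % 13]
    suit = SUITS[(card - 1) // 13]
    return f"{rank} of {suit}"


def face_value(card):
    """Rules program: assumed values, Ace = 1 up to King = 13."""
    return (card - 1) % 13 + 1


def line_sum(cards_on_line):
    """Another rules program: the total to print next to a line of cards."""
    return sum(face_value(card) for card in cards_on_line)


print(style(1))               # Ace of Spades
print(line_sum([1, 13, 14]))  # 15 (Ace + King + Ace)
```

Swapping in a different `style` changes how the deck looks without touching `line_sum`, and vice versa.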
It took me several days to internalize this, but once I did, things started getting fun pretty quickly. For example, I was able to make a program to "clone" a real-world image onto a playing card by having Realtalk take a picture of a rectangle and then always project that image onto a particular card when it was face up. Because the cards just made claims about their identities, this let me separate the designs on the cards from the rules, as well as add "training mode" programs to help teach the basics. And they were all decoupled!