Maybe? If you just round each object up to 4 KiB, then you can implement this using the current PTE format on x86_64, but that removes the (supposed) advantage of only requiring a single PTE for each object (or "object cache" lookup entry, or whatever you want to call it) in the cases where an object spans multiple pages' worth of data.
Having arbitrary-sized objects would likely be possible in hardware - it's just an extra size field stored in the PTE, provided you can mask the object ID out of the address (in the example in the original post it's a whole 64-bit object ID, allowing a full 64 bits of offset within each object, but totaling a huge 128-bit effective address).
But arbitrary sizes feel like they push the issues that many userspace allocators have to deal with today down to the hardware/microcode - namely, packing to cope with fragmentation and the like (only instead of virtual address space they would have to deal with physical address space). Today's solutions are certainly non-trivial and can still fail in many ways, so this is far from solved, let alone solved in a way simple enough to be implemented that close to the hardware.
Doesn't that imply the minimum-sized object requires 4 KiB of physical RAM?
Is that a problem?