C++: Dynamic Programming - Maximizing Gold : Cooperative Play and Known Pots

1. Analysis
In the last two blog we have discussed competitive play with unknown pots, cooperative play with unknown pots and competitive play with known pots. In this blog I am going to discuss cooperative play with known pots.

Comparing with competitive play with known pots the difference is on the second player's strategy. In competitive play with known pots the second player is to maximize the gold as the first player. However in cooperative play with known pots the second player is to minimize the gold in order to let the first player maximize the gold. In another word in competitive play both players have to bring up their best game, however in cooperative play the first player has to play the best and the second player has to play the worst.

2. DP's Sub-problem
As we discussed above, the strategy for the first player is to maximize the gold and the strategy for the second player is to minimize the gold. Or in other word the second player is to pick the gold pot in the first player's favor which allows the first player to maximize the gold.

Given a serial gold pots (P1, P2, ..., Pn), and at any stage of play F(Pi, ..., Pj), where i, j are within [1, n] and i is not greater than j.
For the first player:
- Pick the left pot, then the return is Pi + F(Pi+1, ..., Pj)
- Pick the right pot, then the return is Pj + F(Pi, ..., Pj-1)
- And take the bigger return
For the first player:
- Pick the left pot, then the return for the first player is F(Pi+1, ..., Pj)
- Pick the right pot, then the return for the first player is F(Pi, ..., Pj-1)
- Pick the pot that result in a bigger return for the first player

3. C++ Implementation
As competitive play with known pots, the caching technique is used to save the duplicate calculation as well.

// ********************************************************************************
// IMPLEMENTATION
// ********************************************************************************
#include <unordered_map>
#include <vector>

struct GoldPotKey {
size_t startIndex;
size_t endIndex;

GoldPotKey(size_t s, size_t e)
: startIndex(s), endIndex(e)
{}

bool operator==(const GoldPotKey& rhs) const {
return startIndex == rhs.startIndex &&
endIndex == rhs.endIndex;
}
};

struct GoldPotKeyHash {
size_t operator()(const GoldPotKey& key) const {
return std::hash<double>()(key.startIndex) ^ std::hash<size_t>()(key.endIndex);
}

bool operator()(const GoldPotKey& k1, const GoldPotKey& k2) const {
return k1 == k2;
}
};

typedef size_t REWARD;
typedef std::unordered_map<GoldPotKey, REWARD, GoldPotKeyHash, GoldPotKeyHash>
GoldPotRewardHashMap;

size_t MaxGoldCooperativeAndKnown_DP(const std::vector<size_t>& goldPots,

const size_t startIndex,

const size_t endIndex,

const bool isFirstPlayer,

GoldPotRewardHashMap& gprhm)

{

if (startIndex == (endIndex - 1)) {

if (isFirstPlayer) {

return goldPots[startIndex];

}

return 0;

}

const GoldPotKey gpk{ startIndex, endIndex };

{

GoldPotRewardHashMap::const_iterator foundIter = gprhm.find(gpk);

if (foundIter != gprhm.end()) {

return foundIter->second;

}

size_t returnOfOpponentIfTakeTheLeftPot;

size_t returnOfOpponentIfTakeTheRightPot;

if (isFirstPlayer)

{

returnOfOpponentIfTakeTheLeftPot = goldPots[startIndex] +

MaxGoldCooperativeAndKnown_DP(goldPots,

startIndex + 1, endIndex, false, gprhm);

returnOfOpponentIfTakeTheRightPot = goldPots[endIndex - 1] +

MaxGoldCooperativeAndKnown_DP(goldPots,

startIndex, endIndex - 1, false, gprhm);

}

else {

returnOfOpponentIfTakeTheLeftPot =

MaxGoldCooperativeAndKnown_DP(goldPots, startIndex + 1,

endIndex, true, gprhm);

returnOfOpponentIfTakeTheRightPot =

MaxGoldCooperativeAndKnown_DP(goldPots,

startIndex, endIndex - 1, true, gprhm);

}

// the second player tries to maximize his/her gold, and take the bigger

// return and the left is for the first player

size_t expectedReward =

returnOfOpponentIfTakeTheLeftPot > returnOfOpponentIfTakeTheRightPot ?

returnOfOpponentIfTakeTheLeftPot : returnOfOpponentIfTakeTheRightPot;

gprhm.insert(std::make_pair(gpk, expectedReward));

return expectedReward;

}

size_t MaxGoldCooperativeAndKnown(const std::vector<size_t>& goldPots)

{

if (goldPots.empty()) {

return 0;

}

size_t totalPots = goldPots.size();

if (totalPots == 1) {

return goldPots[0];

}

GoldPotRewardHashMap gprhm;

return MaxGoldCooperativeAndKnown_DP(goldPots, 0, goldPots.size(), true, gprhm);

}

// ********************************************************************************

// TEST

// ********************************************************************************

#include <cassert>

void TestCases()

{

std::vector<size_t> goldPots = { 1, 2 };

assert(MaxGoldCooperativeAndKnown(goldPots) == 2);

goldPots = { 1, 3, 2 };

assert(MaxGoldCooperativeAndKnown(goldPots) == 5);

goldPots = { 1, 2, 10, 3 };

assert(MaxGoldCooperativeAndKnown(goldPots) == 13);

goldPots = { 4, 1, 2, 10 };

assert(MaxGoldCooperativeAndKnown(goldPots) == 14);

goldPots = { 4, 1, 2, 10, 3 };

assert(MaxGoldCooperativeAndKnown(goldPots) == 17);

goldPots = { 5, 4, 1, 2, 10 };

assert(MaxGoldCooperativeAndKnown(goldPots) == 19);

goldPots = { 5, 4, 1, 2, 10, 3 };

assert(MaxGoldCooperativeAndKnown(goldPots) == 19);

}

// ********************************************************************************

C++

Sunday 14 December 2014

Dynamic Programming - Maximizing Gold : Cooperative Play and Known Pots

No comments:

Post a Comment

Blog Archive